Abstract: While generalized algebraic datatypes (GADT) are now considered well-understood, 
adding them to a language with a notion of subtyping comes with a few surprises. What does 
it mean for a GADT parameter to be covariant? The answer turns out to be quite subtle. It 
involves fine-grained properties of the subtyping relation that raise interesting design questions. 
We allow variance annotations in GADT definitions, study their soundness, and present a sound 
and complete algorithm to check them. Our work may be applied to real-world ML-like languages 
with explicit subtyping such as OCaml, or to languages with general subtyping constraints. 

Key-words: subtyping, datatypes, variance 
GADT with subtyping 

Summary: Generalized Algebraic Datatypes (GADT) are now well understood, but adding 
them to a language equipped with subtyping had a few surprises in store. What does it 
mean for a GADT parameter to be covariant? The answer turns out to be difficult. It 
involves fine-grained properties of the subtyping relation that raise interesting 
language design questions. We allow variance annotations in GADT definitions, study 
their soundness, and present a sound and complete algorithm to check them. Our work 
may be applied to a full-fledged ML-inspired language with explicit subtyping, such as 
OCaml, or even to languages with general subtyping constraints. 

Keywords: subtyping, datatypes, variance 
1 Motivation 

In languages that have a notion of subtyping, the interface of parametrized types 
usually specifies a variance. It defines the subtyping relation between two instances 
of a parametrized type from the subtyping relations that hold between their parameters. 
For example, the type α list of immutable lists is expected to be covariant: 
we wish σ list ≤ σ' list as soon as σ ≤ σ'. 
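
As a small illustration of this covariance, here is a sketch in plain OCaml. The object types and the names o, with_m and empties are chosen for this example only; it merely assumes that < m : int > is a subtype of the empty object type < >, as discussed later in this section.

```ocaml
(* An object with a single method m; its type is < m : int >. *)
let o = object method m = 1 end

(* A list of such objects. *)
let with_m : < m : int > list = [o; o]

(* Because list is covariant and < m : int > is a subtype of < >,
   the whole list can be coerced explicitly with (:>). *)
let empties : < > list = (with_m :> < > list)

(* The coercion does not change the runtime value. *)
let count = List.length empties
```

The coercion is explicit: OCaml never inserts it silently, which is why this work speaks of explicit subtyping.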

Variance is essential in languages whose programming idioms rely on subtyping, 
in particular object-oriented languages. Another reason to care about variance is its 
use in the relaxed value restriction [Gar04]: while a possibly-effectful expression, also 
called an expansive expression, cannot be soundly generalized in ML (unless some 
sophisticated enhancement of the type system keeps track of effectful expressions), it 
is always sound to generalize type variables that only appear in covariant positions, 
which may not classify mutable values. This relaxation uses an intuitive subtyping 
argument: all occurrences of such type variables can be specialized to ⊥, and anytime 
later, all covariant occurrences of the same variable (which are now ⊥) can be 
simultaneously replaced by the same arbitrary type τ, which is always a supertype 
of ⊥. This relaxation of the value restriction is implemented in OCaml, where it is 
surprisingly useful. Therefore, it is important for extensions of type definitions, 
such as GADT, to support it as well through a clear and expressive definition of 
parameter covariance. 

For example, consider the following GADT of well-typed expressions: 
type +α expr = 
  | Val : α → α expr 
  | Int : int → int expr 
  | Thunk : ∀β. β expr * (β → α) → α expr 
  | Prod : ∀βγ. β expr * γ expr → (β * γ) expr 

Is it safe to say that expr is covariant in its type parameter? It turns out that, 
using the subtyping relation of the OCaml type system, the answer is "yes". But, 
surprisingly to us, in a type system with a top type ⊤, the answer would be "no". 
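
For readers who want to experiment, the declaration above can be transcribed into concrete OCaml syntax roughly as follows. This is only a sketch: the variance annotation is left out on purpose (whether a checker may accept it is precisely the question studied here), and the eval function and the example value are additions for illustration, not part of the original development.

```ocaml
(* The expr GADT in concrete OCaml syntax, without variance annotation. *)
type _ expr =
  | Val : 'a -> 'a expr
  | Int : int -> int expr
  | Thunk : 'b expr * ('b -> 'a) -> 'a expr
  | Prod : 'b expr * 'c expr -> ('b * 'c) expr

(* A well-typed evaluator; each branch refines the type parameter. *)
let rec eval : type a. a expr -> a = function
  | Val v -> v
  | Int n -> n
  | Thunk (b, f) -> f (eval b)
  | Prod (b, c) -> (eval b, eval c)

(* A pair of an integer and the length of a string. *)
let example = Prod (Int 1, Thunk (Val "ab", String.length))
let result = eval example
```

In the Int and Prod branches the type-checker learns an equation on the parameter (a = int, or a = b * c); these equations are what makes the variance question subtle.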

The aim of this article is to present a sound and complete criterion to check 
soundness of parameter variance annotations, for use in a type-checker. We also 
discuss the apparent fragility of this criterion with respect to changes to the 
subtyping relation (e.g. the presence or absence of a top type, private types, etc.), and 
a different, more robust way to combine GADT and subtyping. 

Examples 

Let us first explain why it is reasonable to say that α expr is covariant. Informally, 
if we are able to coerce a value of type α into one of type α' (we write (v :> α') to 
explicitly cast a value v of type α to a value of type α'), then we are also able to 
transform a value of type α expr into one of type α' expr. Here is some 
pseudo-code¹ for the coercion function: 

let coerce : α expr → α' expr = function 
  | Val (v : α) -> Val (v :> α') 
  | Int n -> Int n 
  | Thunk β (b : β expr) (f : β → α) -> 
    Thunk β b (fun x -> (f x :> α')) 
  | Prod β γ ((b, c) : β expr * γ expr) -> 
    (* if β * γ ≤ α', then α' is of the form 
       β' * γ' with β ≤ β' and γ ≤ γ' *) 
    Prod β' γ' ((b :> β' expr), (c :> γ' expr)) 

¹ The variables β' and γ' of the Prod case are never really defined, only justified at the meta-level, 
making this code only an informal sketch. 

In the Prod case, we make an informal use of something we know about the OCaml 
type system: the supertypes of a tuple are all tuples. By entering the branch, we 
gain the knowledge that α must be equal to some type of the form β * γ. So from 
α ≤ α' we know that β * γ ≤ α'. Therefore, α' must itself be a pair of the form 
β' * γ'. By covariance of the product, we deduce that β ≤ β' and γ ≤ γ'. This 
allows us to conclude by casting at types β' expr and γ' expr, recursively. 

Similarly, in the Int case, we know that α must be an int and therefore an 
int expr is returned. This is because we know that, in OCaml, no type is above 
int: if int ≤ τ, then τ must be int. 

What we use in both cases is reasoning of the form²: "if T[β̄] ≤ α', then I know 
that α' is of the form T[β̄'] for some β̄'". We call this an upward closure property: 
when we "go up" from a T[β̄], we only find types that also have the structure 
of T. Similarly, for contravariant parameters, we would need a downward closure 
property: T is downward-closed if T[β̄] ≥ α' entails that α' is of the form T[β̄']. 

Before studying a more troubling example, we define the classic equality type 
(α, β) eq, and the corresponding casting function cast : ∀αβ. (α, β) eq → α → β: 

type (α, β) eq = 
  | Refl : ∀γ. (γ, γ) eq 

let cast (eqab : (α, β) eq) : α → β = 
  match eqab with 
  | Refl -> (fun x -> x) 

Notice that it would be unsound³ to define eq as covariant, even in only one 
parameter. For example, if we had type (+α, =β) eq, from any σ ≤ τ we could subtype 
(σ, σ) eq into (τ, σ) eq, allowing us to cast any value of type τ back into one of type 
σ, which is unsound in general. 
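
The equality type and its cast function can be written in concrete OCaml syntax as in the following sketch. The helper functions sym and trans and the final test value are additions made here for illustration; they are not part of the original text.

```ocaml
(* The equality GADT: the only witness relates a type to itself. *)
type (_, _) eq = Refl : ('a, 'a) eq

(* Matching on Refl teaches the type-checker that a and b coincide. *)
let cast : type a b. (a, b) eq -> a -> b =
  fun Refl x -> x

(* Illustrative helpers: witnesses can be reversed and composed. *)
let sym : type a b. (a, b) eq -> (b, a) eq = fun Refl -> Refl

let trans : type a b c. (a, b) eq -> (b, c) eq -> (a, c) eq =
  fun Refl Refl -> Refl

(* Casting along a composed witness leaves the value unchanged. *)
let n = cast (trans Refl (sym Refl)) 42
```

Since a witness can be used in both directions, both parameters are used positively and negatively, which is the intuition behind keeping eq invariant.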

As a counter-example, the following declaration is incorrect: the type a t cannot 
be declared covariant. 

type +α t = 
  | K : < m : int > → < m : int > t 
let v = (K (object method m = 1 end) :> < > t) 

This declaration uses the OCaml object type < m : int >, which qualifies objects 
having a method m returning an integer. It is a subtype of object types with fewer 
methods, in this case the empty object type < >, so the alleged covariance of t, if 
accepted by the compiler, would allow us to cast a value of type < m : int > t 
into one of type < > t. However, from such a value, we could wrongly deduce 
an equality witness (< >, < m : int >) eq that allows us to cast any empty object of 
type < > into an object of type < m : int >, but this is unsound, of course! 

let get_eq : α t → (α, < m : int >) eq = function 
  | K _ -> Refl (* locally α = < m : int > *) 

let wrong : < > -> < m : int > = 
  let eq : (< >, < m : int >) eq = get_eq v in 
  cast eq 

It is possible to reproduce this example using a different feature of the OCaml 
type system named private type abbreviations⁴: a module using a type type t = τ 
internally may describe its interface as type t = private τ. This is a compromise 
between a type abbreviation and an abstract type: it is possible to cast a value 
of type t into one of type τ, but not, conversely, to construct a value of type t 
from one of type τ. In other words, t is a strict subtype of τ: we have t ≤ τ 
but not t ≥ τ. Take for example type file_descr = private int: this 
semi-abstraction is useful to enforce invariants by restricting the construction of values 
of type file_descr, while allowing users to conveniently and efficiently destruct 
them for inspection at type int. 

² We write T[β̄] for a type expression T that may contain free occurrences of variables β̄ and 
T[σ̄] for the simultaneous substitution of σ̄ for β̄ in T. 
³ This counterexample is due to Jeremy Yallop. 
⁴ This counterexample is due to Jacques Garrigue. 
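
The behaviour of a private abbreviation can be observed with a small self-contained sketch. The value chosen for stdin and the checked constructor of_int are assumptions made for this illustration only; they do not come from the original text.

```ocaml
module M : sig
  type file_descr = private int
  val stdin : file_descr
  (* Hypothetical checked constructor, for illustration only. *)
  val of_int : int -> file_descr option
end = struct
  type file_descr = int
  let stdin = 0 (* assumed value *)
  let of_int n = if n >= 0 then Some n else None
end

(* Going up from the private type to int is an explicit coercion. *)
let as_int (fd : M.file_descr) : int = (fd :> int)

let stdin_number = as_int M.stdin
let rejected = M.of_int (-1)

(* The converse direction is refused by the type-checker:
   let forged : M.file_descr = (3 :> M.file_descr) *)
```

Only the module M can build values of type M.file_descr, so any invariant enforced by of_int holds for every such value. This is the guarantee that an unsound covariance declaration would break.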

Unsound GADT covariance declarations would defeat the purpose of such private 
types: as soon as the user gets one element of the private type, she could forge 
values of this type, as illustrated by the code below: 
module M = struct 
  type file_descr = int 
  let stdin = ... 
  let open = ... 
end : sig 
  type file_descr = private int 
  val stdin : file_descr 
  val open : string -> (file_descr, error) sum 
end 

type +α t = 
  | K : M.file_descr → M.file_descr t 

(* assumed definition: p is a K value coerced with the alleged 
   covariance of t, using M.file_descr ≤ int *) 
let p = (K M.stdin :> int t) 

let get_eq : α t -> (α, M.file_descr) eq = function 
  | K _ -> Refl 

let forge : int -> M.file_descr = 
  fun (x : int) -> cast (get_eq p) x 

The difference between the former, correct Prod case and those two latter 
situations with unsound variance is the notion of upward closure. The types α * β 
and int used in the correct example were upward-closed. On the contrary, the 
private type M.file_descr has a distinct supertype int, and similarly the object 
type < m : int > has a supertype < > with a different structure (no method m). 

In this article, we formally show that these notions of upward and downward 
closure are the key to a sound variance check for GADT. We start from the formal 
development of Simonet and Pottier [SP07], which provides a general soundness 
proof for a language with subtyping and a very general notion of GADT expressing 
arbitrary constraints, rather than only type equalities. By specializing their 
correctness criterion, we can express it in terms of syntactic checks for closure and 
variance, which are simple to implement in a type-checker. 

The problem of non-monotonicity 

There is a problem with those upward or downward closure assumptions: while they 
hold in core ML, with strong inversion theorems, they are non-monotonic properties: 
they are not necessarily preserved by extensions of the subtyping lattice. For example, 
OCaml has a concept of private types: a type specified by type t = private τ 
is a new semi-abstract type smaller than τ (t ≤ τ but not t ≥ τ), that can be defined 
a posteriori for any type. Hence, no type is downward-closed forever. That is, for 
any type τ, a new, strictly smaller type may always be defined in the future. 

This means that closure properties of the OCaml type system are relatively 
weak: no type is downward-closed⁵ (so instantiated GADT parameters cannot be 
contravariant), and arrow types are not upward-closed as their domain should be 
downward-closed. Only purely positive algebraic datatypes are upward-closed. The 
subset of GADT declarations that can be declared covariant today is small, yet, we 
think, large enough to capture a lot of useful examples, such as α expr above. 

⁵ Except types that are only defined privately in a module and not exported: they exist in 
a "closed world" and we can check, for example, that they are never used in a private type 
definition. 

Giving back the freedom of subtyping 

It is disturbing that the type system should rely on non-monotonic properties: if 
we adopt the correctness criterion above, we must be careful in the future not to 
enrich the subtyping relation too much. 

Consider private types for example: one could imagine a symmetric concept 
of a type that would be strictly above a given type τ; we will name those types 
invisible types (they can be constructed, but not observed). Invisible types and 
GADT covariance seem to be working against each other: if the designer adds one, 
adding the other later will be difficult. 

A solution to this tension is to allow the user to locally guarantee negative 
properties about subtyping (what is not a subtype), at the cost of selectively 
abandoning the corresponding flexibility. Just as object-oriented languages have final 
classes that cannot be extended any more, we would like to be able to define some 
types as public (respectively visible), that cannot later be made private (resp. 
invisible). Such declarations would be rejected if the defining type already has 
subtypes (e.g. an object type), and would forbid further declarations of types below 
(resp. above) the defined type, effectively guaranteeing downward (resp. upward) 
closure. Finally, upward or downward closure is a semantic aspect of a type that 
we must have the freedom to publish through an interface: abstract types could 
optionally be declared public or visible. 

Another approach: subtyping constraints 

Getting fine variance properties out of GADT is difficult because they correspond 
to type equalities which, to a first approximation, use their two operands both 
positively and negatively. One way to get an easy variance check is to encourage 
users to change their definitions into different ones that are easier to check. For 
example, consider the following redefinition of α expr (in a speculative extension 
of OCaml with subtyping constraints): 

type +α expr = 
  | Val : ∀α. α → α expr 
  | Int : ∀α[α ≥ int]. int → α expr 
  | Thunk : ∀β. β expr * (β → α) → α expr 
  | Prod : ∀αβγ[α ≥ β * γ]. (β expr * γ expr) → α expr 

It is now quite easy to check that this definition is covariant, since all type equalities 
α = Tᵢ[β̄] have been replaced by inequalities α ≥ Tᵢ[β̄] which are preserved when 
replacing α by a supertype α' ≥ α (we explain this more formally in §4.3). This 
variation on GADT, using subtyping instead of equality constraints, has been studied 
by Emir et al. [EKRY06] in the context of the C# programming language. 

But isn't such a type definition less useful than the previous one, which had a 
stronger constraint? We will discuss this choice in more detail in §4.3. 
On the importance of variance annotations 

Being able to specify the variance of a parametrized datatype is important at ab- 
straction boundaries: one may wish to define a program component relying on an 
abstract type, but still make certain subtyping assumptions on this type. Variance 
assignments provide a framework to specify such a semantic interface with respect 
to subtyping. When this abstract type dependency is provided by an encapsulated 
implementation, the system must check that the provided implementation indeed 
matches the claimed variance properties. 

Assume the user specifies an abstract type 

module type S = sig 
  type (+α) collection 
  val empty : unit -> α collection 
  val app : α collection -> α collection -> α collection 
end 

and then implements it with linked lists 

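module C : S = struct 
  type +α collection = 
    | Nil of unit 
    | Cons of α * α collection 
  let empty () = Nil () 
  (* app is required by S; concatenation is the assumed meaning *) 
  let rec app xs ys = match xs with 
    | Nil () -> ys 
    | Cons (x, rest) -> Cons (x, app rest ys) 
end 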

The type-checker will accept this implementation, as it has the specified variance. 
On the contrary, 

type +α collection = (α list) ref 
let empty () = ref [] 

would be rejected, as ref is invariant. In the following definition: 
let nil = C.empty () 

the right hand-side is not a value, and is therefore not generalized in presence of the 
value restriction; we get a monomorphic type, ?α collection, where ?α is a yet-undetermined 
type variable. The relaxed value restriction [Gar04] indicates that it is sound to 
generalize ?α, as it only appears in covariant positions. Informally, one may unify 
?α with ⊥, add an innocuous quantification over α, and then generalize ∀α. ⊥ collection 
into ∀α. α collection by covariance, assuming a lifting of subtyping to polymorphic type 
schemes. 

The definition of nil will therefore get generalized in presence of the relaxed 
value restriction, which would not be the case if the interface S had specified an 
invariant type. 
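
The effect can be observed in OCaml with the following sketch. The functions singleton and length are hypothetical additions to the interface, introduced here only so that the generalization of nil becomes observable; the constructors are also simplified with respect to the listing above.

```ocaml
module type S = sig
  type +'a collection
  val empty : unit -> 'a collection
  val app : 'a collection -> 'a collection -> 'a collection
  (* Hypothetical additions, used only to observe the result. *)
  val singleton : 'a -> 'a collection
  val length : 'a collection -> int
end

module C : S = struct
  type 'a collection = Nil | Cons of 'a * 'a collection
  let empty () = Nil
  let singleton x = Cons (x, Nil)
  let rec app xs ys =
    match xs with
    | Nil -> ys
    | Cons (x, rest) -> Cons (x, app rest ys)
  let rec length = function
    | Nil -> 0
    | Cons (_, rest) -> 1 + length rest
end

(* Not a syntactic value, yet generalized because collection is
   declared covariant in the interface S. *)
let nil = C.empty ()

(* nil is used at two different element types. *)
let ints = C.app nil (C.singleton 1)
let strings = C.app nil (C.singleton "one")
let sizes = (C.length ints, C.length strings)
```

If the interface declared the parameter without the + annotation, nil would receive a weak (non-generalized) type and the second use would be expected to fail to type-check.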

Related work 

When we encountered the question of checking variance annotations on GADT, we 
expected to find it already discussed in the literature. The work of Simonet and 
Pottier [SP07] is the closest we could find. It was done in the context of finding 
a good specification for type inference of code using GADT, and in this context it is 
natural to embed some form of constraint solving in the type inference problem. 
From there, Simonet and Pottier generalized to a rich notion of GADT defined over 
arbitrary constraints, in presence of a subtyping relation, justified in their setting 
by potential applications to information flow checking. 

They do not describe a particular type system, but a parametrized framework 
HMG(X), in the spirit of the inference framework HM(X). In this setting, they prove 
a general soundness result, applicable to all type systems which satisfy their model 
requirements. We directly reuse this soundness result, by checking that we respect 
these requirements and proving that their condition for soundness is met. This 
allows us to concentrate purely on the static semantics, without having to define 
our own dynamic semantics to formulate subject reduction and progress results. 

Their soundness requirement is formulated in terms of a general constraint 
entailment problem involving arbitrary constraints. Specializing this to our setting is 
simple, but expressing it in a form that is amenable to mechanical verification is 
surprisingly harder; this is the main result of this paper. Furthermore, at their level of 
generality, the design issues related to subtyping of GADT, in particular the notion 
of upward and downward-closed type constructors, were not apparent. Our article 
is therefore not only a specialized, more practical instance of their framework, but 
also raises new design issues. 

The other major related work, by Emir, Kennedy, Russo and Yu [EKRY06], 
studies the soundness of having subtyping constraints on classes and methods of 
an object-oriented type system with generics (parametric polymorphism). Previous 
work [KR05] had already established the relation between the GADT style of having 
type equality constraints on data constructors and the desirable object-oriented 
feature of having type equality constraints on object methods. This work extends 
it to general subtyping constraints and develops a syntactic soundness proof in the 
context of a core type system for an object-oriented language with generics. 

The general duality between the "sums of data" prominent in functional 
programming and "record of operations" omnipresent in object-oriented programming 
is now well-understood. Yet, it is surprisingly difficult to reason on the 
correspondence between GADT and generalized method constraints; an application that is 
usually considered to require GADT in a functional style (for example a 
strongly-typed α expr datatype and its associated eval function) is simply expressed 
in idiomatic object-oriented style without specific constraints⁶, while the simple 
flatten : ∀α. α list list → α list requires an equality or subtyping constraint 
when expressed in object-oriented style. 

These important differences of style and applications make it difficult to compare 
our present work with this one. Our understanding of this system is that a subtyping 
constraint of the form X ≤ Y is considered to be a negative occurrence of X, 
and a positive occurrence of Y; this means that equality constraints (which are 
conjunctions of a (≤) constraint and a (≥) constraint) always impose invariance 
on their arguments. Checking correctness of constraints with this notion of variance 
is simpler than with our upward and downward-closure criterion, but also not as 
expressive. It corresponds, in our work, to the idea of GADT with subtyping 
constraints mentioned in the introduction and that we detail in §4.3. 

The design trade-off in this related work is different from our setting; the reason 
why we do not focus on this solution is that it requires explicit annotations at the 
GADT definition site, and more user annotations in pattern matching in our system 
where subtyping is explicitly annotated, while convertibility is implicitly inferred 
by unification. On the contrary, in an OOP system with explicit constraints and 
implicit subtyping, this solution has the advantage of user convenience. 

We can therefore see our present work as a different choice in the design space: we 
want to allow a richer notion of variance assignment for type equalities, at the cost of a 
higher complexity for those checks. Note that the two directions are complementary 
and are both expressed in our formal framework. 

⁶ There is a relation between this way of writing a strongly typed eval function and the "finally 
tagless" approach [Kis] that is known to require only simple ML types. 
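\[
\frac{}{\sigma \le \sigma}
\qquad
\frac{\sigma_1 \le \sigma_2 \qquad \sigma_2 \le \sigma_3}{\sigma_1 \le \sigma_3}
\qquad
\frac{b \leqslant c}{b \le c}
\]
\[
\frac{\sigma \ge \sigma' \qquad \tau \le \tau'}{\sigma \to \tau \le \sigma' \to \tau'}
\qquad
\frac{\sigma \le \sigma' \qquad \tau \le \tau'}{\sigma * \tau \le \sigma' * \tau'}
\qquad
\frac{\mathtt{type}\ \bar{v}\bar{\alpha}\ t \qquad \forall i,\ \sigma_i \prec_{v_i} \sigma'_i}{\bar{\sigma}\ t \le \bar{\sigma}'\ t}
\]

Figure 1: Subtyping relation 
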



2 A formal setting 

We define a core language for Algebraic Datatypes (ADT) and, later, Generalized 
Algebraic Datatypes (GADT), that is an instance of the parametrized HMG(X) 
system of Simonet and Pottier [SP07]. We refine their framework by using variances 
to define subtyping, but rely on their formal description for most of the system, 
in particular the static and dynamic semantics. We ultimately rely on their type 
soundness proof, by rigorously showing (in the next section) that their requirements 
on datatype definitions for this proof to hold are met in our extension with variances. 

2.1 Atomic subtyping 

Our type system defines a subtyping relation between ground types, parametrized 
by a reflexive transitive relation between base constant types (int, bool, etc.). 
Ground types consist of a set of base types b, function types τ₁ → τ₂, product types 
τ₁ * τ₂, and a set of algebraic datatypes σ̄ t. (We write σ̄ for a sequence of types 
(σᵢ)ᵢ∈I.) We use prefix notation for datatype parameters, as is the usage in ML. 
Datatypes may be user-defined by toplevel declarations of the form: 

type v̄ᾱ t = 
  | K₁ of τ¹[ᾱ] 
  | ... 
  | Kₙ of τⁿ[ᾱ] 

This is a disjoint sum: the constructors K_c represent all possible cases and each 
type τᶜ[ᾱ] is the domain of the constructor K_c. Applying it to an argument e of a 
corresponding ground type τᶜ[σ̄] constructs a term of type σ̄ t. Values of this type 
are deconstructed using pattern matching clauses of the form K_c x → e, one for each 
constructor. 

The sequence v̄ᾱ is a binding list of type variables αᵢ along with their variance 
annotation vᵢ, which is a marker among the set {+, −, =, ⋈}. We may associate a 
relation (≺_v) between types to each variance v: 

• ≺_+ is the covariant relation (≤); 

• ≺_− is the contravariant relation (≥), the symmetric of (≤); 

• ≺_= is the invariant relation (=), defined as the intersection of (≤) and (≥); 

• ≺_⋈ is the irrelevant relation (⋈), the full relation such that σ ⋈ τ holds for 
all types σ and τ. 

Given a reflexive transitive relation (⩽) on base types, the subtyping relation on 
ground types (≤) is defined by the inference rules of Figure 1 which, in particular, 
give their meaning to the variance annotations v̄ᾱ. The judgment type v̄ᾱ t simply 
means that the type constructor t has been previously defined with the variance 
annotation v̄ᾱ. Notice that the rules for arrow and product types can be subsumed 
by the rule for datatypes, if one considers them as special datatypes (with a specific 
dynamic semantics) of variance (−, +) and (+, +), respectively. For this reason, the 
following definitions will not explicitly detail the cases for arrows and products. 

Finally, it is routine to show that the rules for reflexivity and transitivity are 
admissible, by pushing them up in the derivation until the base cases b ⩽ c, where 
they can be removed as (⩽) is assumed to be reflexive and transitive. Removing 
reflexivity and transitivity provides us with an equivalent syntax-directed judgment 
having powerful inversion principles: if σ̄ t ≤ σ̄' t and type v̄ᾱ t, then one can 
deduce that for each i, σᵢ ≺_vᵢ σ'ᵢ. 
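
The syntax-directed judgment can be sketched as a small OCaml function. Everything specific in this sketch is an assumption made for illustration: the base relation (bool ⩽ int), the table of constructor variances, and the encoding of arrows and products as ordinary constructors named "->" and "*".

```ocaml
type variance = Co | Contra | Inv | Irr

(* Ground types: base types and applied type constructors. *)
type ty =
  | Base of string
  | Constr of string * ty list

(* Hypothetical variance declarations for a few constructors. *)
let variances = function
  | "->" -> [Contra; Co]
  | "*" -> [Co; Co]
  | "list" -> [Co]
  | "ref" -> [Inv]
  | _ -> []

(* Assumed base relation: reflexive, plus bool below int. *)
let base_le b c = b = c || (b = "bool" && c = "int")

(* Atomic subtyping: head constructors must coincide, and the
   arguments are compared according to the declared variances. *)
let rec subtype s t =
  match s, t with
  | Base b, Base c -> base_le b c
  | Constr (k, args), Constr (k', args') ->
    k = k'
    && List.length args = List.length args'
    && List.length args = List.length (variances k)
    && List.for_all2 (fun v (a, a') -> related v a a')
         (variances k) (List.combine args args')
  | _ -> false

(* The relation associated with each variance. *)
and related v a a' =
  match v with
  | Co -> subtype a a'
  | Contra -> subtype a' a
  | Inv -> subtype a a' && subtype a' a
  | Irr -> true

let arrow a b = Constr ("->", [a; b])
let int_ = Base "int"
let bool_ = Base "bool"
```

No rule for reflexivity or transitivity is needed: both are inherited from base_le, which mirrors the admissibility argument above.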

We insist that our equality relation (=) is here a derived concept, defined from 
the subtyping relation (≤) as the "equiconvertibility" relation (≤ ∩ ≥); in 
particular, it is not defined as the usual syntactic equality. If we have both b₁ ⩽ b₂ and 
b₂ ⩽ b₁ in our relation on base types, for two distinct base types b₁ and b₂, we have 
b₁ = b₂ as types, even though they are syntactically distinct. This choice is inspired 
by the previous work of Simonet and Pottier. 

On the restriction of atomic subtyping   The subtyping system demonstrated 
above is called "atomic". If two head constructors are in the subtyping relation, 
they are either identical or constant (no parameters). Structure-changing subtyping 
occurs only at the leaves of the subtyping derivations. 

While this simplifies the meta-theoretic study of the subtyping relation, this is 
too simplifying for real-world type systems that use non-atomic subtyping relations. 
In our examples using the OCaml type system, private types were a source of 
non-atomic subtyping: if you define type α t₂ = private α t₁, the head constructors 
t₁ and t₂ are distinct yet in a subtyping relation. If we want to apply our formal 
results to the design of such languages, we must be careful to isolate any assumption 
on this atomic nature of our core formal calculus. 

The aspect of non-atomic subtype relations we are interested in is the notion of 
v-closed constructor. We have used this notion informally in the first section (in 
OCaml, product types are +-closed); we now define it formally. 

Definition 1 (Constructor closure) A type constructor ᾱ t is v-closed if, for 
any type sequence σ̄ and type τ such that σ̄ t ≺_v τ holds, then τ is necessarily equal 
to σ̄' t for some σ̄'. 

In our core calculus, all type constructors are v-closed for any v ≠ ⋈, but we will 
still mark this hypothesis explicitly when it appears in typing judgments; this lets 
the formal results be adapted more easily to a non-atomic type system. 

It would have been even more convincing to start from a non-atomic subtyping 
relation. However, the formal system of Simonet and Pottier, whose soundness 
proof we ultimately reuse, restricts subtyping relations between (G)ADT types to 
atomic subtyping. We are confident their proof (and then our formal setting) can 
be extended to cover the non-atomic case, but we have left this extension to future 
work. 

2.2 The algebra of variances 

If we know that σ̄ t ≤ σ̄' t, that is σ̄ t ≺_+ σ̄' t, and the constructor t has variance 
v̄ᾱ, an inversion principle tells us that for each i, σᵢ ≺_vᵢ σ'ᵢ. But what if we only 
know σ̄ t ≺_u σ̄' t for some variance u different from (+)? If u is (−), we get the 
reverse relation σᵢ ≻_vᵢ σ'ᵢ. If u is (⋈), we get σᵢ ⋈ σ'ᵢ, that is, nothing. This outlines 
a composition operation on variances u.vᵢ, such that if σ̄ t ≺_u σ̄' t then σᵢ ≺_(u.vᵢ) σ'ᵢ 
holds. It is defined by the following table: 

  v.w |  =   +   −   ⋈    (w) 
  ----+---------------- 
   =  |  =   =   =   ⋈ 
   +  |  =   +   −   ⋈ 
   −  |  =   −   +   ⋈ 
   ⋈  |  ⋈   ⋈   ⋈   ⋈ 
  (v) 


This operation is associative and commutative. Such an operator, and the algebraic 
properties of variances explained below, have already been used by other authors, 
for example [Abe06]. 
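
The composition table is small enough to be written down and checked exhaustively. The following sketch uses constructor names (Inv, Co, Contra, Irr) chosen here for the four variances =, +, − and ⋈; the helper for_all3 is likewise only an illustration.

```ocaml
(* Inv is (=), Co is (+), Contra is (−), Irr is (⋈). *)
type variance = Inv | Co | Contra | Irr

(* Composition u.v: Irr absorbs everything, Inv absorbs the rest,
   Co is neutral, and two reversals cancel out. *)
let compose u v =
  match u, v with
  | Irr, _ | _, Irr -> Irr
  | Inv, _ | _, Inv -> Inv
  | Co, w | w, Co -> w
  | Contra, Contra -> Co

let all = [Inv; Co; Contra; Irr]

(* Check a property on every triple of variances. *)
let for_all3 p =
  List.for_all (fun a ->
      List.for_all (fun b ->
          List.for_all (fun c -> p a b c) all) all) all

let commutative = for_all3 (fun a b _ -> compose a b = compose b a)

let associative =
  for_all3 (fun a b c ->
      compose a (compose b c) = compose (compose a b) c)
```

Both commutative and associative evaluate to true, which confirms the algebraic claim on the 64 possible triples.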

There is a natural order relation between variances, which is the coarser-than 
order between the corresponding relations: v ≤ w if and only if (≺_v) ⊇ (≺_w); i.e. 
if and only if, for all σ and τ, σ ≺_w τ implies σ ≺_v τ.⁷ This reflexive, partial order 
is described by the following lattice diagram: 

        = 
       / \ 
      +   − 
       \ / 
        ⋈ 

That is, all variances are smaller than = and bigger than ⋈. 

From the order lattice on variances we can define join ∨ and meet ∧ of variances: 
v ∨ w is the biggest variance such that v ∨ w ≤ v and v ∨ w ≤ w; conversely, v ∧ w 
is the lowest variance such that v ≤ v ∧ w and w ≤ v ∧ w. Finally, the composition 
operation is monotonic: if v ≤ v' then w.v ≤ w.v' (and v.w ≤ v'.w). 
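
The order and the two bounds can be sketched in the same style. The function names leq, glb and lub are chosen for this illustration; following the definitions just given, glb computes what the text writes v ∨ w and lub what it writes v ∧ w.

```ocaml
(* Inv is (=), Co is (+), Contra is (−), Irr is (⋈). *)
type variance = Inv | Co | Contra | Irr

(* v ≤ w: Irr is below everything, Inv is above everything,
   and Co and Contra are incomparable. *)
let leq v w = v = w || v = Irr || w = Inv

(* Greatest variance below both v and w (written v ∨ w in the text). *)
let glb v w =
  if leq v w then v else if leq w v then w else Irr

(* Lowest variance above both v and w (written v ∧ w in the text). *)
let lub v w =
  if leq v w then w else if leq w v then v else Inv

let all = [Inv; Co; Contra; Irr]

(* Sanity check: glb is a lower bound and lub an upper bound. *)
let bounds_ok =
  List.for_all (fun v ->
      List.for_all (fun w ->
          leq (glb v w) v && leq (glb v w) w
          && leq v (lub v w) && leq w (lub v w)) all) all
```

The only interesting cases are the incomparable pair: glb Co Contra is Irr and lub Co Contra is Inv.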

We will frequently manipulate vectors v̄ᾱ of variables associated with variances, 
which correspond to the "context" Γ of a type declaration. We extend our operations 
pairwise on those contexts: Γ ∨ Γ' and Γ ∧ Γ', and the ordering between contexts 
Γ ≤ Γ'. We also extend the variance-dependent subtyping relation (≺_v), which 
becomes an order (≺_Γ) between vectors of types of the same length: σ̄ ≺_(v̄ᾱ) σ̄' holds 
when for all i we have σᵢ ≺_vᵢ σ'ᵢ. 

2.3 Variance assignment in ADTs 

A counter-example   To have a sound type system, some datatype declarations 
must be rejected. Assume (only for this example) that we have two base types 
int and bool such that bool ⩽ int holds but int ⩽ bool does not. Consider the 
following type declaration: 

type (+α, +β) t = 
  | Fun of α → β 

If it were accepted, we could type the following program, which deduces from 
the (+α) variance that (bool, bool) t ≤ (int, bool) t; that is, we could turn the 
identity function of type bool → bool into one of type int → bool and then turn 
an integer into a boolean: 

let three_as_bool : bool = 
  match (Fun (fun x -> x) : (bool, bool) t :> (int, bool) t) with 
  | Fun (danger : int → bool) -> danger 3 



⁷ The reason for this order reversal is that the relations occur as hypotheses, in negative position, 
in the definition of subtyping: if we have v ≤ w and type v̄ᾱ t, it is safe to assume type w̄ᾱ t: σ̄ ≺_w̄ σ̄' 
implies σ̄ ≺_v̄ σ̄', which implies σ̄ t ≤ σ̄' t. One may also see it, as Abel notes, as an "information 
order": knowing that σ ≺_+ τ "gives you more information" than knowing that σ ≺_⋈ τ, therefore 
⋈ ≤ +. 
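\[
\textsc{vc-Var}\ \frac{w\alpha \in \Gamma \qquad w \ge v}{\Gamma \vdash \alpha : v}
\qquad
\textsc{vc-Constr}\ \frac{\Gamma \vdash \mathtt{type}\ \bar{w}\bar{\alpha}\ t \qquad \forall i,\ \Gamma \vdash \sigma_i : v.w_i}{\Gamma \vdash \bar{\sigma}\ t : v}
\]

Figure 2: Variance assignment 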



A requirement for type soundness   We say that the type type v̄ᾱ t defined 
by the constructors (K_c of τᶜ[ᾱ]) for c ∈ C is well-signed if 

∀c ∈ C, ∀σ̄, ∀σ̄', σ̄ t ≤ σ̄' t ⟹ τᶜ[σ̄] ≤ τᶜ[σ̄'] 

The definition of (+α, +β) t is not well-signed because we have (⊥, ⊥) t ≤ (int, ⊥) t 
according to the variance declaration, but we do not have the corresponding 
conclusion (⊥ → ⊥) ≤ (int → ⊥). 

This is a simplified version, specialized to simple algebraic datatypes, of the 
soundness criterion of Simonet and Pottier. They proved that this condition is 
sufficient⁸ for soundness: if all datatype definitions accepted by the type-checker 
are well-signed, then both subject reduction and progress hold (for their static and 
dynamic semantics, using the subtyping relation (≤) we have defined). 

A judgment for variance assignment   When reformulating the well-signedness 
requirement of Simonet and Pottier for simple ADT, in our specific case where the 
subtyping relation is defined by variance, it becomes a simple check on the variance 
of type definitions. Our example above is unsound as it claims that α is covariant 
while α in fact appears in negative position in the definition. 

In the context of higher-order subtyping [Abe06], where type abstractions are 
first-class and annotated with a variance (λvα.τ), it is natural to present this check 
as a kind checking of the form Γ ⊢ τ : κ, where Γ is a context v̄ᾱ of type variables 
associated with variances. For example, if +α ⊢ τ : ⋆ is provable, it is sound 
to consider α covariant in τ. In the context of a simple first-order monomorphic 
type calculus, this amounts to a monotonicity check on the type τ as defined by 
[EKRY06]. Both approaches use judgments of a peculiar form where the context 
changes when going under a type constructor: to check Γ ⊢ σ → τ, one checks 
Γ ⊢ τ but (Γ/−) ⊢ σ, where Γ/− reverses all the variances in the context Γ (turns 
(−) into (+) and conversely). Abel gives an elegant presentation of this inversion 
/ as an algebraic operation on variances, a quasi-inverse such that u/v ≤ w if and 
only if u ≤ w.v. This context change is also reminiscent of the context resurrection 
operation of the literature on proof irrelevance (in the style of [Pfe01] for example). 

We chose an equivalent but more conventional style where the context of
subderivations does not change: instead of a judgment $\Gamma \vdash \tau$ that becomes $(\Gamma/u) \vdash \sigma$
when moving to a subterm of variance $u$, we define a judgment of the form $\Gamma \vdash \tau : v$,
that evolves into $\Gamma \vdash \sigma : (v.u)$. The two styles are equally expressive: our judgment
$\Gamma \vdash \tau : v$ holds if and only if $(\Gamma/v) \vdash \tau$ holds in Abel's system, but we found
that this one extends more naturally to checking decomposability, as will later be
necessary. The inference rules for the judgment $\Gamma \vdash \tau : v$ are defined on Figure 2.
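
The two rules of Figure 2 are syntax-directed, so they can be read as a recursive function. The OCaml sketch below is only an illustration: the names `variance`, `leq`, `compose`, `ty`, `check` and `sigs` are hypothetical, the composition table is the usual one on variances (assumed here, since its definition lies outside this section), and the variances of the type constructors are supplied by hand.

```ocaml
(* Variances: irrelevant, covariant, contravariant, invariant. *)
type variance = Irr | Co | Contra | Inv

(* Information order on variances: Irr <= Co, Contra <= Inv. *)
let leq v w =
  match v, w with
  | Irr, _ | _, Inv -> true
  | Co, Co | Contra, Contra -> true
  | _ -> false

(* Variance composition v.w (assumed table: Irr is absorbing, then Inv
   is absorbing, Co is neutral, and Contra.Contra = Co). *)
let compose v w =
  match v, w with
  | Irr, _ | _, Irr -> Irr
  | Inv, _ | _, Inv -> Inv
  | Co, x | x, Co -> x
  | Contra, Contra -> Co

(* Type expressions: variables and applied type constructors. *)
type ty = Var of string | App of string * ty list

(* check sigs gamma ty v decides the judgment  gamma |- ty : v. *)
let rec check sigs gamma ty v =
  match ty with
  | Var a ->
    (* vc-Var: the context must assign the variable some w >= v *)
    (match List.assoc_opt a gamma with
     | Some w -> leq v w
     | None -> false)
  | App (t, args) ->
    (* vc-Constr: check each argument at the composed variance v.w_i *)
    (match List.assoc_opt t sigs with
     | Some ws when List.length ws = List.length args ->
       List.for_all2 (fun w arg -> check sigs gamma arg (compose v w)) ws args
     | _ -> false)

(* Hand-written variances for arrow, product and int. *)
let sigs = [ "->", [Contra; Co]; "*", [Co; Co]; "int", [] ]
let arrow a b = App ("->", [a; b])
let int_ty = App ("int", [])
```

For instance, `check sigs ["a", Co] (arrow (Var "a") int_ty) Co` is rejected because the variable occurs to the left of an arrow, while the same query under the context `["a", Contra]` is accepted.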

A semantics for variance assignment. This syntactic judgment $\Gamma \vdash \tau : v$
corresponds to a semantic property about the types and context involved, which
formalizes our intuition of "when the variables vary along $\Gamma$, the expression $\tau$ varies
along $v$". We also give a few formal results about this judgment.



It turns out that this condition is not necessary and can be slightly weakened: we will discuss
that later.
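Definition 2 (Interpretation of the variance checking judgment)

We write $\llbracket \Gamma \vdash \tau : v \rrbracket$ for the property: $\forall \overline{\sigma}, \overline{\sigma}',\ \overline{\sigma} \prec_\Gamma \overline{\sigma}' \implies \tau[\overline{\sigma}] \prec_v \tau[\overline{\sigma}']$.

Lemma 1 (Correctness of variance checking) $\Gamma \vdash \tau : v$ is provable if and only
if $\llbracket \Gamma \vdash \tau : v \rrbracket$ holds.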

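Proof:

Soundness: $\Gamma \vdash \tau : v$ implies $\llbracket \Gamma \vdash \tau : v \rrbracket$. By induction on the derivation. In
the variable case this is direct. In the $\overline{\sigma}\ t$ case, for $\overline{\rho}, \overline{\rho}'$ such that $\overline{\rho} \prec_\Gamma \overline{\rho}'$, we get
$\forall i,\ \sigma_i[\overline{\rho}] \prec_{v.w_i} \sigma_i[\overline{\rho}']$ by inductive hypothesis, which allows us to conclude, by definition
of variance composition, that $(\overline{\sigma}\ t)[\overline{\rho}] \prec_v (\overline{\sigma}\ t)[\overline{\rho}']$.

Completeness: $\llbracket \Gamma \vdash \tau : v \rrbracket$ implies $\Gamma \vdash \tau : v$. By induction on $\tau$; in the variable case
this is again direct. In the $\overline{\sigma}\ t$ case, given $\overline{\rho} \prec_\Gamma \overline{\rho}'$ such that $(\overline{\sigma}\ t)[\overline{\rho}] \prec_v (\overline{\sigma}\ t)[\overline{\rho}']$,
we can deduce by inversion that for each parameter of variance $w_i$ in $\overline{\alpha}\ t$ we
have $\sigma_i[\overline{\rho}] \prec_{v.w_i} \sigma_i[\overline{\rho}']$, which allows us to inductively build the subderivations
$\Gamma \vdash \sigma_i : v.w_i$. ■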

Lemma 2 (Monotonicity) If $\Gamma \vdash \tau : v$ is provable and $\Gamma \leq \Gamma'$ then $\Gamma' \vdash \tau : v$ is
provable.

Lemma 3 If $\Gamma \vdash \tau : v$ and $\Gamma' \vdash \tau : v$ both hold, then $(\Gamma \wedge \Gamma') \vdash \tau : v$ also holds.

Corollary 1 (Principality) For any type $\tau$ and any variance $v$, there exists a
minimal context $\Delta$ such that $\Delta \vdash \tau : v$ holds. That is, for any other context $\Gamma$ such
that $\Gamma \vdash \tau : v$, we have $\Delta \leq \Gamma$.
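
Corollary 1 only states that a minimal context exists. One plausible way to compute it, sketched below in OCaml, is to walk the type and take, for each variable, the least upper bound of the variances required by its occurrences. This is an illustrative sketch rather than the algorithm of the paper: `join`, `merge`, `principal` and `sigs` are hypothetical names, the composition table is the assumed usual one, and a variable absent from the result is understood as irrelevant.

```ocaml
type variance = Irr | Co | Contra | Inv

(* Variance composition v.w (assumed table, Irr absorbing, Co neutral). *)
let compose v w =
  match v, w with
  | Irr, _ | _, Irr -> Irr
  | Inv, _ | _, Inv -> Inv
  | Co, x | x, Co -> x
  | Contra, Contra -> Co

(* Least upper bound in the order Irr <= Co, Contra <= Inv. *)
let join v w =
  match v, w with
  | Irr, x | x, Irr -> x
  | Co, Co -> Co
  | Contra, Contra -> Contra
  | _ -> Inv

type ty = Var of string | App of string * ty list

(* Pointwise join of two contexts given as association lists. *)
let merge d1 d2 =
  List.fold_left
    (fun acc (a, v) ->
       let old = Option.value (List.assoc_opt a acc) ~default:Irr in
       (a, join old v) :: List.remove_assoc a acc)
    d1 d2

(* principal sigs ty v returns the least context delta such that
   delta |- ty : v. Raises Not_found on an unknown type constructor. *)
let rec principal sigs ty v =
  match ty with
  | Var a -> [ (a, v) ]
  | App (t, args) ->
    let ws = List.assoc t sigs in
    List.fold_left2
      (fun delta w arg -> merge delta (principal sigs arg (compose v w)))
      [] ws args

let sigs = [ "->", [Contra; Co]; "*", [Co; Co]; "int", [] ]
```

With these definitions, a variable that occurs on both sides of an arrow, as in `App ("->", [Var "a"; Var "a"])`, receives the invariant variance at `Co`, since no smaller variance satisfies both occurrences.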

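Inversion of subtyping. We have mentioned in §2.1 the inversion properties of
our subtyping relation. From $\overline{\sigma}\ t \leq \overline{\sigma}'\ t$ we can deduce subtyping relations on the
type parameters $\sigma_i, \sigma'_i$. This can be generalized to any type expression $\tau[\overline{\alpha}]$:

Theorem 1 (Inversion) For any type $\tau[\overline{\alpha}]$, variance $v$, and type sequences $\overline{\sigma}$ and
$\overline{\sigma}'$, the subtyping relation $\tau[\overline{\sigma}] \prec_v \tau[\overline{\sigma}']$ holds if and only if the judgment $\Gamma \vdash \tau : v$
holds for some context $\Gamma$ such that $\overline{\sigma} \prec_\Gamma \overline{\sigma}'$.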

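Proof: The reverse implication is a direct application of the soundness of the
variance judgment.

The direct implication is proved by induction on $\tau[\overline{\alpha}]$. The variable case is direct: if
$\alpha[\overline{\sigma}] \prec_v \alpha[\overline{\sigma}']$ holds then for $\Gamma$ equal to $(v\alpha)$ we indeed have $v\alpha \vdash \alpha : v$ and $\overline{\sigma} \prec_\Gamma \overline{\sigma}'$.

In the $\overline{\tau}\ t$ case, we have that $(\overline{\tau}\ t)[\overline{\sigma}] \prec_v (\overline{\tau}\ t)[\overline{\sigma}']$. Suppose the variance of
$\overline{\alpha}\ t$ is $\overline{w\alpha}$: by inversion on the head type constructor $t$ we deduce that for each
$i$, $\tau_i[\overline{\sigma}] \prec_{v.w_i} \tau_i[\overline{\sigma}']$. Our induction hypothesis then gives us a family of contexts
$(\Gamma_i)_{i \in I}$ such that for each $i$ we have $\Gamma_i \vdash \tau_i : v.w_i$. Furthermore, $\overline{\sigma} \prec_{\Gamma_i} \overline{\sigma}'$ holds for
all $\Gamma_i$, which means that $\overline{\sigma} \prec_{\bigvee_{i \in I} \Gamma_i} \overline{\sigma}'$. Let's define $\Gamma$ as $\bigvee_{i \in I} \Gamma_i$. By construction
we have $\Gamma \geq \Gamma_i$, so by monotonicity (Lemma 2) we have $\Gamma \vdash \tau_i : v.w_i$ for each $i$.
This allows us to conclude $\Gamma \vdash \overline{\tau}\ t : v$ as desired. ■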

Note that this would work even for type constructors that are not $v$-closed:
we are not comparing $\tau[\overline{\sigma}]$ to any type $\tau'$, but to a type $\tau[\overline{\sigma}']$ sharing the same
structure; the head constructors are always the same.

For any given pair $\overline{\sigma}, \overline{\sigma}'$ such that $\tau[\overline{\sigma}] \prec_v \tau[\overline{\sigma}']$, we can produce a context $\Gamma$ such
that $\overline{\sigma} \prec_\Gamma \overline{\sigma}'$. But is there a common context that would work for any pair? Indeed,
that is the lowest possible context, the principal context $\Gamma$ such that $\Gamma \vdash \tau : v$.
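Corollary 2 (Principal inversion) If $\Delta$ is principal for $\Delta \vdash \tau : v$, then for any
type sequences $\overline{\sigma}$ and $\overline{\sigma}'$, the subtyping relation $\tau[\overline{\sigma}] \prec_v \tau[\overline{\sigma}']$ implies $\overline{\sigma} \prec_\Delta \overline{\sigma}'$.

Proof: Let $\Delta$ be the principal context such that $\Delta \vdash \tau : v$ holds. For any $\overline{\sigma}, \overline{\sigma}'$ such
that $\tau[\overline{\sigma}] \prec_v \tau[\overline{\sigma}']$, by inversion (Theorem 1) we have some $\Gamma$ such that $\Gamma \vdash \tau : v$
and $\overline{\sigma} \prec_\Gamma \overline{\sigma}'$. By definition of $\Delta$, $\Delta \leq \Gamma$ so $\overline{\sigma} \prec_\Delta \overline{\sigma}'$ also holds. That is, $\overline{\sigma} \prec_\Delta \overline{\sigma}'$
holds for any $\overline{\sigma}, \overline{\sigma}'$ such that $\tau[\overline{\sigma}] \prec_v \tau[\overline{\sigma}']$. ■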

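Checking variance of type definitions. We have all the machinery in place to
explain the checking of ADT variance declarations. The well-signedness criterion of
Simonet and Pottier gives us a general semantic characterization of which definitions
are correct: a definition $\mathtt{type}\ \overline{v\alpha}\ t = (K_c\ \mathtt{of}\ \tau^c[\overline{\alpha}])_{c \in C}$ is correct if, for each
constructor $c$, we have:

$$\forall \overline{\sigma}, \forall \overline{\sigma}',\quad \overline{\sigma}\ t \leq \overline{\sigma}'\ t \implies \tau^c[\overline{\sigma}] \leq \tau^c[\overline{\sigma}']$$

By inversion of subtyping, $\overline{\sigma}\ t \leq \overline{\sigma}'\ t$ implies $\sigma_i \prec_{v_i} \sigma'_i$ for all $i$. Therefore, it
suffices to check that:

$$\forall \overline{\sigma}, \forall \overline{\sigma}',\quad (\forall i,\ \sigma_i \prec_{v_i} \sigma'_i) \implies \tau^c[\overline{\sigma}] \leq \tau^c[\overline{\sigma}']$$

This is exactly the semantic property corresponding to the judgment $\overline{v\alpha} \vdash \tau^c : (+)$!
That is, we have reduced soundness verification of an algebraic type definition to a
mechanical syntactic check on the constructor argument type.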

This syntactic criterion is very close to the one implemented in actual type 
checkers, which do not need to decide general subtyping judgments — or worse solve 
general subtyping constraints — to check variance of datatype parameters. Our aim 
is now to find a similar syntactic criterion for the soundness of variance annotations 
on guarded algebraic datatypes, rather than simple algebraic datatypes. 
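
As a concrete illustration of this syntactic check in a real type checker, the following OCaml sketch declares variance annotations on simple algebraic datatypes and uses them through explicit coercions. The type and function names (`tree`, `printer`, `widen`, `narrow`) are made up for the example, and polymorphic variants are used only because they provide a convenient source of non-trivial subtyping.

```ocaml
(* A covariant parameter: 'a only occurs in positive positions. *)
type +'a tree = Leaf | Node of 'a tree * 'a * 'a tree

(* A contravariant parameter: 'a only occurs to the left of an arrow. *)
type -'a printer = Printer of ('a -> string)

(* OCaml rejects the following declaration, because 'a is declared
   covariant but occurs in negative position:
     type +'a bad = Fun of ('a -> int)                              *)

(* Covariance lifts the subtyping between variant types through tree. *)
let widen (t : [ `A ] tree) : [ `A | `B ] tree =
  (t : [ `A ] tree :> [ `A | `B ] tree)

(* Contravariance reverses the direction of the coercion. *)
let narrow (p : [ `A | `B ] printer) : [ `A ] printer =
  (p : [ `A | `B ] printer :> [ `A ] printer)
```

Both coercions are accepted without solving any subtyping constraint: the checker only consults the declared variances, which were validated once when the type was defined.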

2.4 Variance annotations in GADT 

A general description of GADT. When used to build terms of type $\overline{\alpha}\ t$, a
constructor $K\ \mathtt{of}\ \tau$ behaves like a function of type $\forall \overline{\alpha}.(\tau \to \overline{\alpha}\ t)$. Remark that the
codomain is exactly $\overline{\alpha}\ t$, the type $t$ instantiated with parametric variables. GADT
arise by relaxing this restriction, allowing constructors to be specified with richer types
of the form $\forall \overline{\alpha}.(\tau \to \overline{\sigma}\ t)$. See for example the declaration of constructor Prod in
the introduction:

| Prod : $\forall \beta\gamma.\ \beta\ \mathtt{expr} * \gamma\ \mathtt{expr} \to (\beta * \gamma)\ \mathtt{expr}$

Instead of being just $\alpha\ \mathtt{expr}$, the codomain is now $(\beta * \gamma)\ \mathtt{expr}$. We moved from simple
algebraic datatypes to so-called generalized algebraic datatypes. This approach
is natural and convenient for the users, so it is exactly the syntax chosen in
languages with explicit GADT support, such as Haskell and OCaml, and is reminiscent
of the inductive datatype definitions of dependently typed languages.

However, for formal study of GADT, a different formulation based on equality
constraints is preferred. The idea is that we will force again the codomain to be
exactly $\alpha\ \mathtt{expr}$, but allow additional type equations such as $\alpha = \beta * \gamma$ in this
example:

| Prod : $\forall \alpha.\ \forall \beta\gamma[\alpha = \beta * \gamma].\ \beta\ \mathtt{expr} * \gamma\ \mathtt{expr} \to \alpha\ \mathtt{expr}$

This restricted form justifies the name of guarded algebraic datatype. The $\forall \beta\gamma[D].\tau$
notation, coming from Simonet and Pottier, is a constrained type scheme: $\beta, \gamma$ may
only be instantiated with type parameters respecting the constraint $D$. Note that,
as $\beta$ and $\gamma$ do not appear in the codomain anymore, we may equivalently change
the outer universal into an existential on the left-hand side of the arrow:

| Prod : $\forall \alpha.\ (\exists \beta\gamma[\alpha = \beta * \gamma].\ \beta\ \mathtt{expr} * \gamma\ \mathtt{expr}) \to \alpha\ \mathtt{expr}$

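In the general case, a GADT definition for $\overline{\alpha}\ t$ is composed of a set of constructor
declarations, each of the form:

| K : $\forall \overline{\alpha}.\ (\exists \overline{\beta}[\overline{\alpha} = \overline{\sigma}[\overline{\beta}]].\ \tau[\overline{\alpha}, \overline{\beta}]) \to \overline{\alpha}\ t$

or, reusing the classic notation,

| K of $\exists \overline{\beta}[\overline{\alpha} = \overline{\sigma}[\overline{\beta}]].\ \tau[\overline{\alpha}, \overline{\beta}]$

Without loss of generality, we can conveniently assume that the variables $\overline{\alpha}$ do
not appear in the parameter type $\tau$ anymore: if some $\alpha_i$ appears in $\tau$, one may
always pick a fresh existential variable $\beta$, add the constraint $\alpha_i = \beta$ to $D$, and
consider $\tau[\beta/\alpha_i]$. Let us re-express the introductory example in this form, that is,
K of $\exists \overline{\beta}[\overline{\alpha} = \overline{\sigma}[\overline{\beta}]].\ \tau[\overline{\beta}]$:

type $\alpha$ expr =
  | Val of $\exists \beta[\alpha = \beta].\ \beta$
  | Int of $[\alpha = \mathtt{int}].\ \mathtt{int}$
  | Thunk of $\exists \beta\gamma[\alpha = \gamma].\ \beta\ \mathtt{expr} * (\beta \to \gamma)$
  | Prod of $\exists \beta\gamma[\alpha = \beta * \gamma].\ \beta\ \mathtt{expr} * \gamma\ \mathtt{expr}$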

If all constraints between brackets are of the simple form $\alpha_i = \beta_i$ (for distinct
variables $\alpha_i$ and $\beta_i$), as for the constructor Thunk, then we have a constructor with
existential types as described by Läufer and Odersky [OL92]. If furthermore there
are no other existential variables than those equated with a type parameter, as in
the Val case, we have a usual algebraic type constructor; of course the whole type
is "simply algebraic" only if each of its constructors is algebraic.
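
To relate the two presentations to a concrete language, here is a sketch of the `expr` example in OCaml, first in the direct GADT syntax and then in a guarded style where every constructor returns `'a gexpr` and carries an explicit equality witness. The witness type `eq`, the type `gexpr` and the evaluators `eval` and `geval` are illustrative names introduced for this example only; the encoding with `eq` is a common idiom, not the formal system of Simonet and Pottier.

```ocaml
(* Direct GADT syntax: the codomain of Prod is ('b * 'c) expr. *)
type _ expr =
  | Val : 'a -> 'a expr
  | Int : int -> int expr
  | Thunk : 'b expr * ('b -> 'a) -> 'a expr
  | Prod : 'b expr * 'c expr -> ('b * 'c) expr

(* Equality witness standing for a constraint between two types. *)
type (_, _) eq = Refl : ('a, 'a) eq

(* Guarded style: the codomain is always 'a gexpr, and the index is
   constrained by an explicit equality carried by the constructor. *)
type 'a gexpr =
  | GVal : 'a -> 'a gexpr
  | GInt : ('a, int) eq * int -> 'a gexpr
  | GThunk : 'b gexpr * ('b -> 'a) -> 'a gexpr
  | GProd : ('a, 'b * 'c) eq * 'b gexpr * 'c gexpr -> 'a gexpr

let rec eval : type a. a expr -> a = function
  | Val v -> v
  | Int n -> n
  | Thunk (e, f) -> f (eval e)
  | Prod (e1, e2) -> (eval e1, eval e2)

(* Matching on Refl brings the equation into scope, which is what
   makes each branch well-typed at the result type a. *)
let rec geval : type a. a gexpr -> a = function
  | GVal v -> v
  | GInt (Refl, n) -> n
  | GThunk (e, f) -> f (geval e)
  | GProd (Refl, e1, e2) -> (geval e1, geval e2)
```

In `GThunk` and `GProd` the variables `'b` and `'c` do not appear in the result type, so they behave as the existential variables of the guarded formulation.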

In the rest of the paper, we extend our former core language with such guarded
algebraic datatypes. This impacts the typing rules (which are precisely defined
in Simonet and Pottier), but not the notion of subtyping, which is defined on
(GADT) type constructors with variance $\mathtt{type}\ \overline{v\alpha}\ t$ just as it previously was on
simple datatypes. What needs to be changed, however, is the soundness criterion
for checking the variance of type definitions.

The correctness criterion. Simonet and Pottier [SP07] define a general
framework HMG(X) to study type systems with GADT where the type equalities in
bounded quantification are generalized to an arbitrary constraint language. They
make few assumptions on the type system used, mostly that it has function types
$\sigma \to \tau$, user-definable (guarded) algebraic datatypes $\overline{\alpha}\ t$, and a subtyping relation
$\sigma \leq \tau$ (which may be just equality, in languages without subtyping).

They use this general type system to give static semantics (typing rules) to a
fixed untyped lambda-calculus equipped with datatype construction and pattern
matching operations. They are able to prove a type soundness result under just
some general assumptions on the particular subtyping relation $(\leq)$. Here are the
three requirements to get their soundness result:

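1. Incomparability of distinct types: for all types $\tau_1, \tau_2, \overline{\sigma}, \overline{\sigma}'$ and distinct datatypes
$\overline{\alpha}\ t$, $\overline{\alpha}'\ t'$, the types $(\tau_1 \to \tau_2)$, $\tau_1 * \tau_2$, $\overline{\sigma}\ t$ and $\overline{\sigma}'\ t'$ must be pairwise
incomparable (both $\not\leq$ and $\not\geq$); this is where our restriction to an atomic subtyping
relation, discussed in §2.1, comes from.

2. Decomposability of function and product types: if $\tau_1 \to \tau_2 \leq \sigma_1 \to \sigma_2$
(respectively $\tau_1 * \tau_2 \leq \sigma_1 * \sigma_2$), we must have $\tau_1 \geq \sigma_1$ (resp. $\tau_1 \leq \sigma_1$) and
$\tau_2 \leq \sigma_2$.

3. Decomposability of datatypes: for each datatype $\overline{\alpha}\ t$ and all type vectors $\overline{\sigma}$
and $\overline{\sigma}'$ such that $\overline{\sigma}\ t \leq \overline{\sigma}'\ t$, we must have $(\exists \overline{\beta}[D[\overline{\sigma}]]\tau) \leq (\exists \overline{\beta}[D[\overline{\sigma}']]\tau)$ for
each constructor K of $\exists \overline{\beta}[D[\overline{\beta}, \overline{\alpha}]].\ \tau[\overline{\beta}]$.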

Those three criteria are necessary for the soundness proof. We will now explain
how variance of type parameters impacts those requirements, that is, how to match
a GADT implementation against a variance specification. With our definition of
subtyping based on variance, and the assumption that the datatype $\mathtt{type}\ \overline{v\alpha}\ t$ we are
defining indeed has variance $\overline{v\alpha}$, is the GADT decomposability requirement (item
3 above) satisfied by all its constructors? If so, then the datatype definition is
sound and can be accepted. Otherwise, the datatype definition does not match the
specified variance, and should be rejected by the type checker.

3 Checking variances of GADT 

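For every type definition, we need to check that the decomposability requirement of
Simonet and Pottier holds. Remark that it is expressed for each GADT constructor
independently of the other constructors for the same type: we can check one
constructor at a time.

Assume we check a fixed constructor K of argument type $\exists \overline{\beta}[D[\overline{\alpha}, \overline{\beta}]].\ \tau[\overline{\beta}]$.
Simonet and Pottier prove that their requirement is equivalent to the following
formula, which is more convenient to manipulate:

$$\forall \overline{\sigma}, \overline{\sigma}', \overline{\rho},\quad (\overline{\sigma}\ t \leq \overline{\sigma}'\ t \wedge D[\overline{\sigma}, \overline{\rho}] \implies \exists \overline{\rho}',\ D[\overline{\sigma}', \overline{\rho}'] \wedge \tau[\overline{\rho}] \leq \tau[\overline{\rho}']) \qquad \text{(req-SP)}$$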

The purpose of this section is to extract a practical criterion equivalent to this 
requirement. It should not be expressed as a general constraint satisfaction problem, 
but rather as a syntax-directed and decidable algorithm that can be used in a type- 
checker — without having to implement a full-blown constraint solver. 

A remark on the non-completeness. Note that while the criterion (req-SP) is
sound, it is not complete, even in the simple ADT case.

For a constructor $K\ \mathtt{of}\ \tau$ of $\overline{\alpha}\ t$, the justification for the fact that, under the
hypothesis $\overline{\sigma}\ t \leq \overline{\sigma}'\ t$, we should have $\tau[\overline{\sigma}] \leq \tau[\overline{\sigma}']$ is the following: given a value
$v$ of type $\tau[\overline{\sigma}]$, we can build the value $K\ v$ at type $\overline{\sigma}\ t$, and coerce it to $\overline{\sigma}'\ t$. We
can then deconstruct this value by matching the constructor $K$, whose argument is
of type $\tau[\overline{\sigma}']$. But this whole computation, (match ($K\ v$ :> $\overline{\sigma}'\ t$) with $K\ x \to x$),
reduces to $v$, so for subject reduction to hold we need to also have $v : \tau[\overline{\sigma}']$.
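
The build, coerce and match computation of this argument can be replayed in OCaml on a small covariant datatype. The sketch below is only an illustration under stated assumptions: the type `t`, its constructor `K` and the values `v` and `x` are names chosen for the example, and polymorphic variants play the role of the two comparable types.

```ocaml
(* A one-constructor datatype, declared covariant in its parameter. *)
type +'a t = K of 'a

(* A value at the smaller type [`A]. *)
let v : [ `A ] = `A

(* Build K v at type [`A] t, coerce it to the supertype [`A | `B] t,
   then match the constructor back to recover its argument. *)
let x : [ `A | `B ] =
  match (K v : [ `A ] t :> [ `A | `B ] t) with
  | K x -> x
```

The result `x` is the original value `v`, now seen at the larger type, which is exactly why the argument type of the constructor must vary along the declared variance.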

For this whole argument to work we need a value at type $\tau[\overline{\sigma}]$. In fact, if the
type $\tau[\overline{\sigma}]$ is not inhabited, it can fail to satisfy (req-SP) and still be sound: this
criterion is not complete. See the following example in a consistent system with an
uninhabited type $\bot$:

type $+\alpha$ t =
  | T of int
  | Empty of $\bot * (\alpha \to \mathtt{bool})$

Despite $\alpha$ occurring in a contravariant position in the dead Empty branch (which
violates the soundness criterion of Simonet and Pottier), under the assumption that
the $\bot$ type really is uninhabited we know that this Empty constructor will never
be used in closed code, and the contravariant occurrence will therefore never make
a program "go wrong". The definition is correct, yet rejected by (req-SP), which is
therefore incomplete.

This is an extended version of the soundness requirement for algebraic datatypes: it is now
formulated in terms of guarded existential types $\exists \overline{\beta}[D]\tau$ rather than simple argument types $\tau$.

Deciding type inhabitation in the general case is a very complex question, which
is mostly orthogonal to the presence and design of GADT in the type system.
There is, however, one clear interaction between the type inhabitation question and
GADT. If a GADT $\overline{\alpha}\ t$ is instantiated with type variables $\overline{\sigma}$ that satisfy none of
the constraints $D[\overline{\sigma}]$ of its constructors K of $\exists \overline{\beta}[D].\tau$, then we know that $\overline{\sigma}\ t$ is not
inhabited. This is related to the idea of "domain information" that we discuss in
the Future Work section.

3.1 Expressing decomposability 

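If we specialize (req-SP) to the Prod constructor of the $\alpha\ \mathtt{expr}$ example datatype,
i.e. Prod of $\exists \beta\gamma[\alpha = \beta * \gamma]\ \beta\ \mathtt{expr} * \gamma\ \mathtt{expr}$, we get:

$$\forall \sigma, \sigma', \rho_1, \rho_2,\quad (\sigma\ \mathtt{expr} \leq \sigma'\ \mathtt{expr} \wedge \sigma = \rho_1 * \rho_2 \implies \exists \rho'_1, \rho'_2,\ (\sigma' = \rho'_1 * \rho'_2 \wedge \rho_1 * \rho_2 \leq \rho'_1 * \rho'_2))$$

We can substitute equalities and use the (assumed) covariance to simplify the
subtyping constraint $\sigma\ \mathtt{expr} \leq \sigma'\ \mathtt{expr}$ into $\sigma \leq \sigma'$:

$$\forall \sigma', \rho_1, \rho_2,\quad (\rho_1 * \rho_2 \leq \sigma' \implies \exists \rho'_1, \rho'_2,\ (\sigma' = \rho'_1 * \rho'_2 \wedge \rho_1 \leq \rho'_1 \wedge \rho_2 \leq \rho'_2)) \qquad (1)$$

This is the upward closure property mentioned in the introduction. This
transformation is safe only if any supertype $\sigma'$ of a product $\rho_1 * \rho_2$ is itself a product, i.e.
is of the form $\rho'_1 * \rho'_2$ for some $\rho'_1$ and $\rho'_2$.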

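More generally, for a type $\Gamma \vdash \sigma$ and a variance $v$, we are interested in a closure
property of the form

$$\forall (\overline{\rho} : \Gamma), \sigma',\quad \sigma[\overline{\rho}] \prec_v \sigma' \implies \exists (\overline{\rho}' : \Gamma),\ \sigma' = \sigma[\overline{\rho}']$$

Here, the context $\Gamma$ represents the set of existential variables of the constructor ($\beta$
and $\gamma$ in our example). We can easily express the condition $\rho_1 \leq \rho'_1$ and $\rho_2 \leq \rho'_2$ on
the right-hand side of the implication by considering a context $\Gamma$ annotated with
variances $(+\beta, +\gamma)$, and using the context ordering $(\prec_\Gamma)$. Then, (1) is equivalent
to:

$$\forall (\overline{\rho} : \Gamma), \sigma',\quad \sigma[\overline{\rho}] \prec_v \sigma' \implies \exists (\overline{\rho}' : \Gamma),\ \overline{\rho} \prec_\Gamma \overline{\rho}' \wedge \sigma' = \sigma[\overline{\rho}']$$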

Our aim is now to find a set of inference rules to check decomposability; we will
later reconnect it to (req-SP). In fact, we study a slightly more general relation,
where the equality $\sigma[\overline{\rho}'] = \sigma'$ on the right-hand side is relaxed to an arbitrary
relation $\sigma[\overline{\rho}'] \prec_{v'} \sigma'$:

Definition 3 (Decomposability) Given a context $\Gamma$, a type expression $\sigma[\overline{\beta}]$ and
two variances $v$ and $v'$, we say that $\sigma$ is decomposable under $\Gamma$ from variance $v$ to
variance $v'$, which we write $\Gamma \Vdash \sigma : v \leadsto v'$, if the property

$$\forall (\overline{\rho} : \Gamma), \sigma',\quad \sigma[\overline{\rho}] \prec_v \sigma' \implies \exists (\overline{\rho}' : \Gamma),\ \overline{\rho} \prec_\Gamma \overline{\rho}' \wedge \sigma[\overline{\rho}'] \prec_{v'} \sigma'$$

holds.

We use the symbol $\Vdash$ rather than $\vdash$ to highlight the fact that this is just a logic
formula, not the semantic criterion corresponding to an inductive judgment, nor a
syntactic judgment; we will introduce one later in section 3.4.

Remark that, due to the positive occurrence of the relation $\prec_\Gamma$ in the proposition
$\Gamma \Vdash \tau : v \leadsto v'$ and the anti-monotonicity of $\prec_\Gamma$, this formula is "anti-monotonous"
with respect to the context ordering $\Gamma \leq \Gamma'$. This corresponds to saying that we
can still decompose, but with less information on the existential witness $\overline{\rho}'$.

Lemma 4 (Anti-monotonicity) If $\Gamma \Vdash \tau : v \leadsto v'$ holds and $\Gamma' \leq \Gamma$, then
$\Gamma' \Vdash \tau : v \leadsto v'$ also holds.

Our final decomposability criterion, given below in Figure 3, requires both correct
variances and a decomposability property, so it will be neither monotonous nor
anti-monotonous with respect to the context argument.
In the following subsections, we study the subtleties of decomposability.

3.2 Variable occurrences 

In the Prod case, the type whose decomposability was considered is $\beta * \gamma$ (in the
context $\beta, \gamma$). In this very simple case, decomposability depends only on the type
constructor for the product. In the present type system, with very strong
invertibility principles on the subtyping relation, both upward and downward closures hold
for products, and for any other head type constructor. In the general case, we require
that this specific type constructor be upward-closed.

In the general case, the closure of the head type constructor alone is not enough 
to ensure decomposability of the whole type. For example, in a complex type 
expression with subterms, we should consider the closure of the type constructors 
appearing in the subterms as well. Besides, there are subtleties when a variable 
occurs several times. 

For example, while $\beta * \gamma$ is decomposable from $(+)$ to $(=)$, $\beta * \beta$ is not: $\bot * \bot$ is an
instantiation of $\beta * \beta$, and a subtype of, e.g., $\mathtt{int} * \mathtt{bool}$, but it is not equal to $(\beta * \beta)[\gamma']$
for any $\gamma'$. The same variable occurring twice in covariant position (or having one
covariant and one invariant or contravariant occurrence) breaks decomposability.

On the other hand, two invariant occurrences are possible: $\beta\ \mathtt{ref} * \beta\ \mathtt{ref}$ is
upward-closed (assuming the type constructor ref is invariant and upward-closed):
if $(\sigma\ \mathtt{ref} * \sigma\ \mathtt{ref}) \leq \sigma'$, then by upward closure of the product, $\sigma'$ is of the form
$\sigma'_1 * \sigma'_2$ and by its covariance $\sigma\ \mathtt{ref} \leq \sigma'_1$ and $\sigma\ \mathtt{ref} \leq \sigma'_2$. Now by invariance of
ref we have $\sigma'_1 = \sigma\ \mathtt{ref} = \sigma'_2$, and therefore $\sigma'$ is equal to $\sigma\ \mathtt{ref} * \sigma\ \mathtt{ref}$, which is
an instance of $\beta\ \mathtt{ref} * \beta\ \mathtt{ref}$.

Finally, a variable may appear in irrelevant positions without affecting closure
properties; $\beta * (\beta\ \mathtt{irr})$ (where irr is an upward-closed irrelevant type, defined for
example as $\mathtt{type}\ \alpha\ \mathtt{irr} = \mathtt{int}$) is upward closed: if $\sigma * (\sigma\ \mathtt{irr}) \leq \sigma'$, then $\sigma'$ is of the
form $\sigma'_1 * (\sigma'_2\ \mathtt{irr})$ with $\sigma \leq \sigma'_1$ and $\sigma \bowtie \sigma'_2$, which is equiconvertible to $\sigma'_1 * (\sigma'_1\ \mathtt{irr})$
by irrelevance, an instance of $\beta * (\beta\ \mathtt{irr})$.

3.3 Context zipping 

The intuition to think about these different cases is to consider that, for any $\sigma'$,
we are looking for a way to construct a "witness" $\overline{\sigma}'$ such that $\tau[\overline{\sigma}'] = \sigma'$ from the
hypothesis $\tau[\overline{\sigma}] \prec_v \sigma'$. When a type variable appears only once, its witness can be
determined by inspecting the corresponding position in the type $\sigma'$. For example
in $\alpha * \beta \leq \mathtt{bool} * \mathtt{int}$, the mapping $\alpha \mapsto \mathtt{bool}, \beta \mapsto \mathtt{int}$ gives the witness pair
bool, int.

However, when a variable appears twice, the two witnesses corresponding to the
two occurrences may not coincide. (Consider for example $\beta * \beta \leq \mathtt{bool} * \mathtt{int}$.) If a
variable $\beta_i$ appears in several invariant occurrences, the witness of each occurrence
is forced to be equal to the corresponding subterm of $\tau[\overline{\sigma}]$, that is $\sigma_i$, and therefore
the various witnesses are themselves equal, hence compatible. On the contrary,
for two covariant occurrences (as in the $\beta * \beta$ case), it is possible to pick a $\sigma'$
such that the two witnesses are incompatible, and similarly for one covariant and
one invariant occurrence. Finally, an irrelevant occurrence will never break closure
properties, as all witnesses (forced by another occurrence) are compatible.

We use the term instance to denote the replacement of all the free variables of a type expression
under context by closed types, not the specialization of an ML type scheme.

To express these merging properties, we define a "zip" operation $v_1 \curlyvee v_2$,
that formally expresses which combinations of variances are possible for several
occurrences of the same variable; it is a partial operation (for example, it is not
defined in the covariant-covariant case, which breaks the closure properties) with
the following table:



  v \ w  |   =     +     −     ⋈
  -------+------------------------
    =    |   =                 =
    +    |                     +
    −    |                     −
    ⋈    |   =     +     −     ⋈

(An empty entry means that the zip is undefined.)
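
The zip operation is small enough to be written as a partial function. The OCaml sketch below is an illustration under stated assumptions: the defined cases are the ones discussed in the text (an irrelevant occurrence merges with anything, two invariant occurrences merge, and every other combination is undefined), and the names `zip` and `zip_ctx`, as well as the representation of contexts as association lists listing the same variables in the same order, are choices made for the example.

```ocaml
type variance = Irr | Co | Contra | Inv

(* zip v w is Some u when two occurrences of variances v and w can be
   merged into a single variance u, and None when the zip is undefined. *)
let zip v w =
  match v, w with
  | Irr, x | x, Irr -> Some x
  | Inv, Inv -> Some Inv
  | _ -> None

(* Pointwise zip of two contexts over the same variables, listed in the
   same order. Returns None as soon as one variable cannot be zipped.
   Raises Invalid_argument if the two lists have different lengths. *)
let zip_ctx g1 g2 =
  List.fold_right2
    (fun (a, v) (b, w) acc ->
       match acc, zip v w with
       | Some rest, Some u when a = b -> Some ((a, u) :: rest)
       | _ -> None)
    g1 g2 (Some [])
```

For example, zipping the contexts `["b", Co; "c", Irr]` and `["b", Irr; "c", Co]` gives `Some ["b", Co; "c", Co]`, which is the situation of the product of two distinct variables, whereas zipping `["b", Co]` with itself gives `None`.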
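The following lemma uses zipping to merge together the results of the
decomposition of several subterms $(\tau_i)_i$ into a "simultaneous decomposition".

Definition 4 (Simultaneous decomposition) Given a context $\Gamma$, and families
of type expressions $(\tau_i)_{i \in I}$ and variances $(v_i)_{i \in I}$ and $(v'_i)_{i \in I}$, we define the following
"simultaneous closure property" $\Gamma \Vdash (\tau_i : v_i \leadsto v'_i)_{i \in I}$ as:

$$\forall (\overline{\rho} : \Gamma), \overline{\sigma}',\quad (\forall i \in I,\ \tau_i[\overline{\rho}] \prec_{v_i} \sigma'_i) \implies \exists (\overline{\rho}' : \Gamma),\ \overline{\rho} \prec_\Gamma \overline{\rho}' \wedge (\forall i \in I,\ \tau_i[\overline{\rho}'] \prec_{v'_i} \sigma'_i)$$

Lemma 5 (Soundness of zipping) Suppose we have families of type expressions
$(\tau_i[\overline{\beta}])_{i \in I}$, contexts $(\Gamma_i)_{i \in I}$ and variances $(v_i)_{i \in I}$ and $(v'_i)_{i \in I}$ such that $\curlyvee_{i \in I} \Gamma_i$ exists
and for all $i$ we have both $\Gamma_i \vdash \tau_i : v_i$ and $\Gamma_i \Vdash \tau_i : v_i \leadsto v'_i$. Then, we have
$(\curlyvee_{i \in I} \Gamma_i) \Vdash (\tau_i : v_i \leadsto v'_i)_{i \in I}$.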



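Proof: Without loss of generality, we can consider that there are only two type
expressions $\tau_1[\overline{\beta}]$ and $\tau_2[\overline{\beta}]$ and that the free variables $\overline{\beta}$ are reduced to a single
variable $\beta$. Let $w_1, w_2$ be the respective variances of $\beta$ in $\Gamma_1, \Gamma_2$. We know that
$(w_1 \curlyvee w_2)$ exists and is equal to the variance $w$ of the variable $\beta$ in $\Gamma_1 \curlyvee \Gamma_2$.

Our further assumptions are $\Gamma_i \vdash \tau_i : v_i$ (1) and $\Gamma_i \Vdash \tau_i : v_i \leadsto v'_i$ (2) for $i$ in $\{1, 2\}$.
The expansion of (2) is:

$$\forall i \in I,\ \forall \rho, \sigma'_i,\quad \tau_i[\rho] \prec_{v_i} \sigma'_i \implies \exists \rho',\ \rho \prec_{w_i} \rho' \wedge \tau_i[\rho'] \prec_{v'_i} \sigma'_i \qquad (3)$$

Our goal is to prove $\Gamma_1 \curlyvee \Gamma_2 \Vdash (\tau_i : v_i \leadsto v'_i)_{i \in I}$, which is equivalent to:

$$\forall \rho, \overline{\sigma}',\quad (\forall i \in I,\ \tau_i[\rho] \prec_{v_i} \sigma'_i) \implies \exists \rho',\ \rho \prec_w \rho' \wedge \forall i \in I,\ \tau_i[\rho'] \prec_{v'_i} \sigma'_i \qquad (4)$$

Assume given $\rho$ and $(\sigma'_1, \sigma'_2)$ such that $\tau_1[\rho] \prec_{v_1} \sigma'_1$ (5) and $\tau_2[\rho] \prec_{v_2} \sigma'_2$ (6).
Applying (3) with $i$ equal to 1 and (5) ensures the existence of a $\rho'_1$ such that
$\rho \prec_{w_1} \rho'_1$ (7) and $\tau_1[\rho'_1] \prec_{v'_1} \sigma'_1$ (8). Similarly, there exists $\rho'_2$ such that $\rho \prec_{w_2} \rho'_2$ (9)
and $\tau_2[\rho'_2] \prec_{v'_2} \sigma'_2$ (10). To establish (4), it remains to build a single $\rho'$ that satisfies
$\rho \prec_w \rho'$ (11), $\tau_1[\rho'] \prec_{v'_1} \sigma'_1$ (12) and $\tau_2[\rho'] \prec_{v'_2} \sigma'_2$ (13), simultaneously. We reason
by case analysis on $w_1$ and $w_2$ (restricted to the cases where the zip exists).

If both $w_1$ and $w_2$ are $(\bowtie)$, we take either $\rho'_1$ or $\rho'_2$ for $\rho'$: since $w$ is $\bowtie$, (11) trivially
holds; furthermore, $\tau_i[\rho'] = \tau_i[\rho'_1] = \tau_i[\rho'_2]$ by irrelevance of $w_i$ and (1); therefore
(12) and (13) follow from (8) and (10).

If only one of the $w_i$ is $(\bowtie)$, we'll suppose that it is $w_1$. We then take $\rho'_2$ for $\rho'$. Since
$w_1 \curlyvee w_2$ is $w_2$, (11) follows from (9); furthermore, $\tau_1[\rho'] = \tau_1[\rho'_1]$ by irrelevance of
$w_1$ and (1), while $\tau_2[\rho'] = \tau_2[\rho'_2]$ holds by construction; hence, as in the previous
case, (12) and (13) follow from (8) and (10).

Finally, if both $w_1$ and $w_2$ are $(=)$, then (7) and (9) imply $\rho'_1 = \rho = \rho'_2$. We take
$\rho$ for $\rho'$ and all three conditions are obviously satisfied. ■

The idea of context merging and the term "zipping" are inspired by Montagu and Rémy [MR09].

$$\textsc{sc-Triv}\quad \dfrac{v \geq v' \qquad \Gamma \vdash \tau : v}{\Gamma \vdash \tau : v \Rightarrow v'}
\qquad\qquad
\textsc{sc-Var}\quad \dfrac{w\alpha \in \Gamma \qquad w = v}{\Gamma \vdash \alpha : v \Rightarrow v'}$$

$$\textsc{sc-Constr}\quad \dfrac{\Gamma \vdash \mathtt{type}\ \overline{w\alpha}\ t : v\text{-closed} \qquad \Gamma = \curlyvee_i \Gamma_i \qquad \forall i,\ \Gamma_i \vdash \sigma_i : v.w_i \Rightarrow v'.w_i}{\Gamma \vdash \overline{\sigma}\ t : v \Rightarrow v'}$$

Figure 3: Syntactic decomposability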

This lemma admits a kind of converse lemma stating completeness of zipping, that
says that if $\Gamma \Vdash (\tau_i : v_i \leadsto v'_i)_i$ holds, then $\Gamma$ is indeed related to a zip of contexts
$(\Gamma_i)_i$ that pairwise decompose each of the $(\tau_i)_i$. However, the proof of completeness
is more delicate and we prove it separately in §3.5.



3.4 Syntactic decomposability 

Equipped with the zipping operation, we introduce a judgment $\Gamma \vdash \tau : v \Rightarrow v'$ to
express decomposability, syntactically, defined by the inference rules on Figure 3.

We rely on the zip soundness (Lemma 5) to merge sub-derivations into larger
ones, so in addition to decomposability, the judgment simultaneously ensures that
$v$ is a correct variance for $\tau$ under $\Gamma$. Actually, in order to understand the details
of this judgment, it is quite instructive to compare it with the variance-checking
judgment $\Gamma \vdash \tau : v$ defined on Figure 2.

Rule sc-Var is very similar to vc-Var except that the condition $w \geq v$ is
replaced by a stronger equality $w = v$. The reason why the variance-checking
judgment has an inequality $w \geq v$ is to make it monotonous in the environment, as
requested by its corresponding semantic criterion (Definition 2). Therefore, the
condition $w \geq v$ is necessary for completeness, and admissible. On the contrary,
the present judgment ensures, according to the semantic criterion (Definition 5), that
both the variance is correct (monotonous in the environment) and the type is
decomposable, a property which is anti-monotonic in the environment (Lemma 4).
Therefore, the semantic criterion $\llbracket \Gamma \vdash \tau : v \Rightarrow v' \rrbracket$ is invariant in $\Gamma$ and,
correspondingly, the variable rule must use a strict equality.

The most interesting rule is sc-Constr. It checks first that the head type
constructor is $v$-closed (according to Definition 1); then, it checks each subtype for
decomposability from $v$ to $v'$ with compatible witnesses, that is, in an environment
family $(\Gamma_i)_i$ that can be zipped into a unique environment $\Gamma$.
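
To make the three rules concrete, here is a brute-force OCaml sketch of a checker for the judgment, which tries sc-Triv first and otherwise enumerates every way of splitting the context into a zippable family. It is an illustration under several assumptions, not the algorithm of the paper: all names (`check`, `splits`, `families`, `decomp`, `sigs`) are hypothetical, the composition table is the assumed usual one, the zip is taken to be defined only for irrelevant-with-anything and invariant-with-invariant, and every type constructor is assumed to be $v$-closed so that premise is not checked.

```ocaml
type variance = Irr | Co | Contra | Inv

let leq v w =
  match v, w with
  | Irr, _ | _, Inv -> true
  | Co, Co | Contra, Contra -> true
  | _ -> false

let compose v w =
  match v, w with
  | Irr, _ | _, Irr -> Irr
  | Inv, _ | _, Inv -> Inv
  | Co, x | x, Co -> x
  | Contra, Contra -> Co

type ty = Var of string | App of string * ty list

(* Variance checking  gamma |- ty : v  (rules vc-Var and vc-Constr). *)
let rec check sigs gamma ty v =
  match ty with
  | Var a ->
    (match List.assoc_opt a gamma with Some w -> leq v w | None -> false)
  | App (t, args) ->
    (match List.assoc_opt t sigs with
     | Some ws when List.length ws = List.length args ->
       List.for_all2 (fun w s -> check sigs gamma s (compose v w)) ws args
     | _ -> false)

(* All vectors of length n over {Inv, Irr}. *)
let rec inv_vectors n =
  if n = 0 then [ [] ]
  else List.concat_map (fun tl -> [ Inv :: tl; Irr :: tl ]) (inv_vectors (n - 1))

(* All ways to write w as a zip of n variances: Irr is neutral, Inv
   zips only with Inv, and Co or Contra may occur exactly once. *)
let splits w n =
  match w with
  | Irr -> [ List.init n (fun _ -> Irr) ]
  | Co | Contra ->
    List.init n (fun k -> List.init n (fun i -> if i = k then w else Irr))
  | Inv -> List.filter (List.mem Inv) (inv_vectors n)

(* All families of n contexts whose zip is gamma. *)
let families gamma n =
  List.fold_right
    (fun (a, w) acc ->
       List.concat_map
         (fun split ->
            List.map (List.map2 (fun wi g -> (a, wi) :: g) split) acc)
         (splits w n))
    gamma
    [ List.init n (fun _ -> []) ]

(* decomp sigs gamma ty v v' decides  gamma |- ty : v => v'. *)
let rec decomp sigs gamma ty v v' =
  (leq v' v && check sigs gamma ty v)                  (* sc-Triv *)
  ||
  (match ty with
   | Var a -> List.assoc_opt a gamma = Some v          (* sc-Var *)
   | App (t, args) ->                                  (* sc-Constr *)
     (match List.assoc_opt t sigs with
      | Some ws when List.length ws = List.length args ->
        List.exists
          (fun fam ->
             List.for_all2
               (fun (w, s) g -> decomp sigs g s (compose v w) (compose v' w))
               (List.combine ws args) fam)
          (families gamma (List.length args))
      | _ -> false))

(* Product, an invariant ref and an irrelevant irr, as in section 3.2. *)
let sigs = [ "*", [Co; Co]; "ref", [Inv]; "irr", [Irr] ]
let prod x y = App ("*", [x; y])
let b = Var "b" and c = Var "c"
```

The enumeration of splits is exponential in the number of arguments, which is acceptable for a sketch but not for a production checker. On the examples of section 3.2, `decomp sigs ["b", Co; "c", Co] (prod b c) Co Inv` succeeds while `decomp sigs ["b", Co] (prod b b) Co Inv` fails.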

In order to connect the syntactic and semantic versions of decomposability, we
define the interpretation $\llbracket \Gamma \vdash \tau : v \Rightarrow v' \rrbracket$ of syntactic decomposability.

Definition 5 (Interpretation of syntactic decomposability)

We write $\llbracket \Gamma \vdash \tau : v \Rightarrow v' \rrbracket$ for the conjunction of properties $\llbracket \Gamma \vdash \tau : v \rrbracket$ and
$\Gamma \Vdash \tau : v \leadsto v'$.

Note that our interpretation $\llbracket \Gamma \vdash \tau : v \Rightarrow v' \rrbracket$ does not coincide with our previous
decomposability formula $\Gamma \Vdash \tau : v \leadsto v'$, because of the additional variance-checking
hypothesis that makes it composable. The distinction between those two notions of
decomposition is not useful to have a sound criterion, but is crucial to be complete
with respect to the criterion of Simonet and Pottier, which imposes no variance
checking condition.

Lemma 6 (Soundness of syntactic decomposability)

If the judgment $\Gamma \vdash \tau : v \Rightarrow v'$ holds, then $\llbracket \Gamma \vdash \tau : v \Rightarrow v' \rrbracket$ holds.

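Proof: The proof is by induction on the derivation $\Gamma \vdash \tau : v \Rightarrow v'$ (1). Expanding
$\llbracket \Gamma \vdash \tau : v \Rightarrow v' \rrbracket$, we must show both $\llbracket \Gamma \vdash \tau : v \rrbracket$ or equivalently $\Gamma \vdash \tau : v$ (2) and
$\Gamma \Vdash \tau : v \leadsto v'$ (3), which itself expands to:

$$\forall (\overline{\rho} : \Gamma), \tau',\quad \tau[\overline{\rho}] \prec_v \tau' \implies \exists (\overline{\rho}' : \Gamma),\ \overline{\rho} \prec_\Gamma \overline{\rho}' \wedge \tau[\overline{\rho}'] \prec_{v'} \tau'$$

Let $\overline{\rho}, \tau'$ be such that $\tau[\overline{\rho}] \prec_v \tau'$. We must exhibit a sequence $\overline{\rho}'$ such that $\overline{\rho} \prec_\Gamma \overline{\rho}'$ (4)
and $\tau[\overline{\rho}'] \prec_{v'} \tau'$ (5). Cases where the derivation of (1) ends with sc-Triv and
sc-Var are direct: take $\overline{\rho}$ and $(\ldots, \tau', \ldots)$ for $\overline{\rho}'$, respectively.

In the remaining cases, the derivation ends with rule sc-Constr and $\tau$ is of the
form $\overline{\sigma}\ t$.

• The $v$-closure assumption of the left premise ensures that $\tau'$ is itself of the
form $\overline{\sigma}'\ t$ for some sequence of closed types $\overline{\sigma}'$. By inversion on the variance
$\overline{w\alpha}$ of the head constructor $t$, we deduce $\sigma_i[\overline{\rho}] \prec_{v.w_i} \sigma'_i$ for all $i$ (6).

• The middle premise is the zipping assumption on the contexts $\Gamma = \curlyvee_{i \in I} \Gamma_i$ (7).

• The right premises give us subderivations $\Gamma_i \vdash \sigma_i : v.w_i \Rightarrow v'.w_i$. This implies
$\Gamma_i \vdash \sigma_i : v.w_i$ for all $i$, which implies $\Gamma \vdash \overline{\sigma}\ t : v$, i.e. (2). By induction
hypothesis, this also implies $\Gamma_i \vdash \sigma_i : v.w_i$ (8) and $\Gamma_i \Vdash \sigma_i : v.w_i \leadsto v'.w_i$ for
all $i$ (9).

We may now apply zip soundness (Lemma 5) with hypotheses (8) and (9), which
gives us the simultaneous decomposition $\Gamma \Vdash (\sigma_i : v.w_i \leadsto v'.w_i)_{i \in I}$. Expanding
this property (Definition 4), we may apply it to (6) to get a witness $\overline{\rho}'$ such
that both $\overline{\rho} \prec_\Gamma \overline{\rho}'$, i.e. our first goal (4), and $(\forall i \in I,\ \sigma_i[\overline{\rho}'] \prec_{v'.w_i} \sigma'_i)$, which implies
$(\overline{\sigma}\ t)[\overline{\rho}'] \prec_{v'} \overline{\sigma}'\ t$, i.e. our second goal (5). ■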

Completeness in the general case is however much more difficult and we only
prove it when the right-hand side variance $v'$ is $(=)$. In other words, we take back
the generality that we have introduced in §3.1 when defining decomposability. The
proof requires several auxiliary lemmas; it is the subject of the next subsection.

3.5 Completeness of syntactic decomposability

We first show a few auxiliary results that will serve in the proof of zip completeness,
and later, to reconnect our closure-checking criterion (Definition 5) with the full
criterion of Simonet and Pottier (req-SP).

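Lemma 7 (Intermediate value) Let $\tau[\overline{\alpha}]$ be a type expression and $\overline{\rho}_1, \overline{\rho}_2, \overline{\rho}_3$
three type families such that $\tau[\overline{\rho}_1] \leq \tau[\overline{\rho}_2] \leq \tau[\overline{\rho}_3]$ and $\overline{\rho}_1 \prec_\Gamma \overline{\rho}_3$ holds for some $\Gamma$.
Then, there exists a type family $\overline{\rho}'_2$ such that both $\overline{\rho}_1 \prec_\Gamma \overline{\rho}'_2 \prec_\Gamma \overline{\rho}_3$ and $\tau[\overline{\rho}'_2] = \tau[\overline{\rho}_2]$
hold.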

Proof: We reuse the notations of the definition and assume $\tau[\overline{\rho}_1] \leq \tau[\overline{\rho}_2] \leq \tau[\overline{\rho}_3]$ (1)
and $\overline{\rho}_1 \prec_\Gamma \overline{\rho}_3$ (2). We just have to exhibit $\overline{\rho}'_2$ such that both $\overline{\rho}_1 \prec_\Gamma \overline{\rho}'_2 \prec_\Gamma \overline{\rho}_3$ (3)
and $\tau[\overline{\rho}'_2] = \tau[\overline{\rho}_2]$ (4) hold. Let $\Delta$ be the most general variance of $\tau$, i.e. the lowest
context such that $\Delta \vdash \tau : +$ (5) holds. By principal inversion (Corollary 2) applied
to (1) twice thanks to (5), we have $\overline{\rho}_1 \prec_\Delta \overline{\rho}_2 \prec_\Delta \overline{\rho}_3$ (6).

If $\Delta \geq \Gamma$, the result is immediate, as $\overline{\rho}_1 \prec_\Gamma \overline{\rho}_2 \prec_\Gamma \overline{\rho}_3$ follows from (6) by
anti-monotonicity and both (3) and (4) hold when we take $\overline{\rho}_2$ for $\overline{\rho}'_2$. Otherwise, we
reason on each variable of the context $\Gamma$ independently. We may assume, w.l.o.g.,
that $\tau$ is defined over a single free variable $\alpha$, and $\Gamma$ and $\Delta$ are single variances
$v_\Delta, v_\Gamma$ with $v_\Delta \not\geq v_\Gamma$. We reason by case analysis on the possible variances for
$(v_\Delta, v_\Gamma)$, which are $\{(\_, =), (+, -), (-, +), (\bowtie, \_)\}$.

If $v_\Gamma$ is $(=)$, the hypotheses (2) and (1) become $\rho_1 = \rho_3$ and $\tau[\rho_1] \leq \tau[\rho_2] \leq \tau[\rho_1]$,
which implies $\tau[\rho_1] = \tau[\rho_2]$. Thus, taking $\rho_1$ for $\rho'_2$ satisfies (3) and (4).

If $v_\Delta$ is $(\bowtie)$, then by irrelevance of $\bowtie$ and (5), we have $\tau[\rho_1] = \tau[\rho_2]$. Thus, taking
$\rho_1$ for $\rho'_2$ satisfies (3) and (4), as above.

Finally, the cases $(+, -)$ and $(-, +)$ are symmetric and we will only work out the
first one, i.e. $v_\Delta$ is $(+)$ and $v_\Gamma$ is $(-)$. From $\tau[\rho_1] \leq \tau[\rho_3]$ (which follows from (1)
by transitivity) and (5), we have $\rho_1 \prec_\Delta \rho_3$, i.e. $\rho_1 \leq \rho_3$. Since the hypothesis (2)
becomes $\rho_1 \geq \rho_3$, we have $\rho_1 = \rho_3$. Then, taking $\rho_1$ for $\rho'_2$, (3) trivially holds while
(4) follows from (1). ■

The next lemma connects the monotonicity of the variance-checking judgment 
(checking variance at a lower context provides more information, and is therefore 
harder) and the anti-monotonicity of the decomposability formula (decomposing to 
a higher context provides more information, and is therefore harder): for a fixed 
type expression, the contexts at which you can check variance are higher than the 
contexts at which you can decompose. This property, however, only holds for
non-trivial decomposability results (otherwise any context can decompose): we must
decompose from a $v$ to a $v'$ that do not verify $v \geq v'$, and no variable of the typing
context must be irrelevant.

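Lemma 8 Let $\tau[\bar\alpha]$ be a type and $v$ and $v'$ be variances such that $v \not\geq v'$. If
$\Gamma \Vdash \tau : v \leadsto v'$ and $\Delta$ is the most general context such that $\Delta \vdash \tau : v$ then, for each
non-irrelevant variable $\alpha$ of $\Delta$, we have $\Gamma(\alpha) \leq \Delta(\alpha)$.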



Proof: We show that $\Gamma$ is lower than the lowest possible $\Delta$, i.e. the most
general context such that $\Delta \vdash \tau : v$ holds. Without loss of generality, we consider
the case where $\tau$ has a single, non-irrelevant variable $\alpha$, and $\Gamma$ and $\Delta$ are singleton
contexts over a single variance, respectively $w_\Gamma$ and $w_\Delta$, with $w_\Delta \neq (\bowtie)$.

Therefore, we assume $(w_\Gamma \alpha) \Vdash \tau : v \leadsto v'$ (1) and $(w_\Delta \alpha) \vdash \tau : v$ (2). We prove that
$w_\Gamma \leq w_\Delta$. We actually show that for any $\rho_1, \rho_2$ such that $\rho_1 \prec_{w_\Delta} \rho_2$ we also have
$\rho_1 \prec_{w_\Gamma} \rho_2$ (3).

From $\rho_1 \prec_{w_\Delta} \rho_2$, we can deduce $\tau[\rho_1] \prec_v \tau[\rho_2]$ (4). Applying (1), we get a $\rho'$ such
that $\rho_1 \prec_{w_\Gamma} \rho'$ (5) and $\tau[\rho'] \prec_{v'} \tau[\rho_2]$ (6). We then reason by case analysis on $v \not\geq v'$,
considering the different cases $\{(\bowtie, \_),\ (+, -),\ (-, +),\ (\_, =)\}$.

If $v$ is $(\bowtie)$, the most general $w_\Delta$ is $\bowtie$, a case we explicitly ruled out: there is nothing
to prove.

If $v'$ is $(=)$, then (6) implies $\tau[\rho'] = \tau[\rho_2]$, and in particular $\rho' = \rho_2$; our goal (3)
follows from (5).

If $(v, v')$ is $(+, -)$ (we won't repeat the symmetric case $(-, +)$), then (4) and (6)
become $\tau[\rho_1] \leq \tau[\rho_2]$ and $\tau[\rho_2] \leq \tau[\rho']$. Given (5), the intermediate value lemma
(Lemma 7) ensures the existence of a $\rho''$ such that $\rho_1 \prec_{w_\Gamma} \rho'' \prec_{w_\Gamma} \rho'$ and $\tau[\rho''] = \tau[\rho_2]$.
From there, we deduce $\rho'' = \rho_2$, and our goal (3) follows from $\rho_1 \prec_{w_\Gamma} \rho''$, as in the previous
case. ■

Finally, the following auxiliary lemmas will be useful in the proof of completeness.

Lemma 9 If the principal variance $w$ such that $(w\alpha) \vdash \tau[\alpha] : v$ holds is not the
irrelevant variance $\bowtie$, then $\tau[\rho_1] = \tau[\rho_2]$ implies $\rho_1 = \rho_2$.

Proof: Whatever $v$ is, $\tau[\rho_1] = \tau[\rho_2]$ implies both $\tau[\rho_1] \prec_v \tau[\rho_2]$ and its converse
$\tau[\rho_2] \prec_v \tau[\rho_1]$. This holds in particular for the principal variance $w$ such that
$(w\alpha) \vdash \tau[\alpha] : v$. Moreover, by principal inversion (Corollary 2) applied twice, we
have both $\rho_1 \prec_w \rho_2$ and $\rho_2 \prec_w \rho_1$. If $w$ is distinct from $\bowtie$ this implies $\rho_1 = \rho_2$. ■

Lemma 10 If $\Gamma \Vdash \tau : v \leadsto (=)$ holds for some $\Gamma$, and $\Delta$ is the most general context
such that $\Delta \vdash \tau : v$ holds, then $\Delta \Vdash \tau : v \leadsto (=)$ also holds.

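Proof: Assume $\Gamma \Vdash \tau : v \leadsto (=)$, i.e.

$$\forall \rho, \sigma',\quad \tau[\rho] \prec_v \sigma' \implies \exists \rho',\ \rho \prec_\Gamma \rho' \wedge \tau[\rho'] = \sigma' \qquad (1)$$

Assume that $\Delta$ is principal for $\Delta \vdash \tau : v$ (2). We show $\Delta \Vdash \tau : v \leadsto (=)$, i.e.

$$\forall \rho, \sigma',\quad \tau[\rho] \prec_v \sigma' \implies \exists \rho',\ \rho \prec_\Delta \rho' \wedge \tau[\rho'] = \sigma' \qquad (3)$$

Let $\rho, \sigma'$ be such that $\tau[\rho] \prec_v \sigma'$ (4). By (1), there exists $\rho'$ such that
$\tau[\rho'] = \sigma'$ (5). To prove (3), it only remains to prove that $\rho \prec_\Delta \rho'$ (6). Given (5),
the inequality (4) becomes $\tau[\rho] \prec_v \tau[\rho']$ (7). Then (6) follows by principal inversion
(Corollary 2) applied to (7), given (2). ■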

Lemma 11 Let $\Delta$ be the most general context such that $\Delta \vdash \tau : v$ holds. If $v$ is
$(=)$, then only variances $(=)$ or $(\bowtie)$ may appear in $\Delta$. If $v$ is $(\bowtie)$, then only $(\bowtie)$ may
appear in $\Delta$.

Proof: If $v$ is $(\bowtie)$, the context $\Gamma$ with all variances set to $\bowtie$ satisfies $\Gamma \vdash \tau : v$ (as
$\llbracket \Gamma \vdash \tau : v \rrbracket$ holds). By principality we have $\Delta \leq \Gamma$, so $\Delta$ also has only irrelevant
variances as $(\bowtie)$ is the minimal variance.

If $v$ is $(=)$, we handle each variable of the context independently, that is we can
assume, w.l.o.g., that $\tau$ has only one variable $\alpha$. So $\Delta$ is of the form $(w\alpha)$ and we
know that for any $\rho, \rho'$ such that $\rho \prec_w \rho'$ we have $\tau[\rho] = \tau[\rho']$ (1). If $w$ is not $(\bowtie)$,
by Lemma 9 applied to (1), we have $\rho = \rho'$ for any $\rho \prec_w \rho'$, which means that $w$ is
$(=)$. Summing up, we have shown that $w$ is either $(\bowtie)$ or $(=)$. Reasoning similarly
in the general case, any variance $w$ of $\Delta$ is either $(\bowtie)$ or $(=)$. ■

We can now prove the converse of the zip soundness (Lemma 5), which is the core of
the future proof of completeness of the decomposability judgment $\Gamma \vdash \tau : v \Rightarrow (=)$.

Theorem 2 (Zip completeness) Given any context $\Gamma$, a family of type expressions
$(T_i[\bar\alpha])_{i \in I}$ and a family of variances $(v_i)_{i \in I}$, if the simultaneous decomposition
$\Gamma \Vdash (T_i : v_i \leadsto (=))_{i \in I}$ holds, then there exists a family of contexts $(\Gamma_i)_{i \in I}$ such that
$\Gamma \leq \curlyvee_{i \in I} \Gamma_i$ and both $\Gamma_i \vdash T_i : v_i$ and $\Gamma_i \Vdash T_i : v_i \leadsto (=)$ hold for all $i$.
If furthermore $\Gamma \vdash T_i : v_i$ holds for all $i$, then $\curlyvee_{i \in I} \Gamma_i$ is precisely $\Gamma$.

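Proof: Let us assume the simultaneous decomposition $\Gamma \Vdash (T_i : v_i \leadsto (=))_{i \in I}$, which
expands to:

$$\forall \bar\sigma', \rho,\quad (\forall i,\ T_i[\rho] \prec_{v_i} \sigma'_i) \implies \exists \rho',\ \rho \prec_\Gamma \rho' \wedge (\forall i,\ T_i[\rho'] = \sigma'_i) \qquad (1)$$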






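We construct a family of contexts $(\Gamma_i)_{i \in I}$ such that the following holds:

$$\Gamma \leq \curlyvee_{i \in I} \Gamma_i \ \ (2) \qquad \forall i,\ \Gamma_i \vdash T_i : v_i \ \ (3) \qquad \forall i,\ \Gamma_i \Vdash T_i : v_i \leadsto (=) \ \ (4)$$

where (4) is equivalent to

$$\forall i,\ \forall \sigma'_i, \rho,\quad \big(T_i[\rho] \prec_{v_i} \sigma'_i \implies \exists \rho',\ \rho \prec_{\Gamma_i} \rho' \wedge T_i[\rho'] = \sigma'_i\big) \qquad (5)$$

The first step is to move from the entailment (1) of the form $\forall \rho,\ (\forall i \in I, \dots) \implies
(\exists \rho',\ (\forall i \in I, \dots))$ to the weaker form $(\forall i \in I,\ \forall \rho,\ \dots \implies \exists \rho',\ \dots)$, which is closer to
(5). More precisely, we show that

$$\forall i,\ \forall \sigma'_i, \rho,\quad \big(T_i[\rho] \prec_{v_i} \sigma'_i \implies \exists \rho',\ \rho \prec_\Gamma \rho' \wedge T_i[\rho'] = \sigma'_i\big) \qquad (6)$$

that is, $\forall i,\ \Gamma \Vdash T_i : v_i \leadsto (=)$ (7). Let $i$, $\sigma'_i$, and $\rho$ be such that $T_i[\rho] \prec_{v_i} \sigma'_i$ (8). We
show that there exists a $\rho'$ such that $T_i[\rho'] = \sigma'_i$ (9) and $\rho \prec_\Gamma \rho'$ (10). Let us extend
our type $\sigma'_i$ to a family $\bar\sigma'$, defined by taking $\sigma'_j$ equal to $T_j[\rho]$ for $j$ in $I \setminus \{i\}$. By
construction, we have $(\forall j \in I,\ T_j[\rho] \prec_{v_j} \sigma'_j)$. Therefore, we may apply (1) to get a
$\rho'$ such that $\rho \prec_\Gamma \rho'$, i.e. our first goal (10), and $(\forall j,\ T_j[\rho'] = \sigma'_j)$, which implies our
second goal (9) when $j$ is $i$. This proves (6).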

We now prove that we can refine this to have $\rho \prec_{\Gamma_i} \rho'$ for $(\Gamma_i)_{i \in I}$ such that $\Gamma \leq \curlyvee_i \Gamma_i$.

Let $\Delta_1$ and $\Delta_2$ be the most general contexts such that both $\Delta_1 \vdash T_1 : v_1$ and
$\Delta_2 \vdash T_2 : v_2$ hold (11). Let (2)', (3)', and (4)' be obtained by replacing the $\Gamma_i$'s by the $\Delta_i$'s
in our three goals (2), (3), and (4). In fact (3)' is just (11). By Lemma 10 applied
to (7) twice, given (11), we have both $\Delta_1 \Vdash T_1 : v_1 \leadsto (=)$ and $\Delta_2 \Vdash T_2 : v_2 \leadsto (=)$,
that is, (4)'.

Hence $\Delta_1$ and $\Delta_2$ are correct choices for $\Gamma_1$ and $\Gamma_2$ if they also satisfy the goal (2)',
i.e. $\Delta_1 \curlyvee \Delta_2 \geq \Gamma$. We now study when the remaining goal (2)' holds and, when it does
not, propose a different choice for $\Gamma_1$ and $\Gamma_2$ that respects all three goals.

W.l.o.g., we assume that $I$ is reduced to $\{1, 2\}$ and that there is only one free
variable $\beta$ in $T_1, T_2$. Since we focus on a single variable of the context we name
$w_1$ and $w_2$ the variances of $\beta$ in $\Delta_1$ and $\Delta_2$, respectively. We now reason by case
analysis on the variances $w_1$ and $w_2$.

If both of them are $\bowtie$, we have $\Delta_1 \curlyvee \Delta_2 = (\bowtie\beta)$, so we do not necessarily have
$\Gamma \leq \Delta_1 \curlyvee \Delta_2$. Instead, we make a different choice for $\Gamma_2$. Namely, we pick $\Gamma$
for $\Gamma_2$ and keep $\Delta_1$ for $\Gamma_1$. As $\Delta_2$ is $\bowtie$ we have $\Delta_2 \leq \Gamma_2$, so by monotonicity of
the variance checking judgment we have $\Gamma_2 \vdash T_2 : v_2$ from (11), and we still have
$\Gamma_2 \Vdash T_2 : v_2 \leadsto (=)$ from (7). Hence (3) and (4) are reestablished. Finally, we have
$\Gamma_1 \curlyvee \Gamma_2 = \Gamma$, so in particular $\Gamma \leq \Gamma_1 \curlyvee \Gamma_2$, i.e. (2).

If only one of them is $\bowtie$, we may assume, w.l.o.g., that it is $w_1$. Then $\Delta_1 \curlyvee \Delta_2$ is $\Delta_2$
and we only need to show that $\Gamma \leq \Delta_2$ (12). From (7), we have $\Gamma \Vdash T_2 : v_2 \leadsto (=)$.
We then make a case analysis on $v_2$: if $v_2$ is not $(=)$, then since $\Delta_2$ is most
general and $w_2 \neq \bowtie$, we may apply Lemma 8 to get (12); otherwise, $v_2$ is $(=)$;
Lemma 11 applied to (11) implies that $\Delta_2$ is itself $(=\beta)$, and (12) trivially holds.

Finally, if none of them is $\bowtie$, we first prove that they are both $(=)$. In fact, we only
prove that $w_1$ is $(=)$ (13), as the other case follows by symmetry. To prove (13),
we assume that $\rho''$ is such that $\rho \prec_{w_1} \rho''$ and we show that $\rho = \rho''$ (14) holds.
By (11), we have $T_1[\rho] \prec_{v_1} T_1[\rho'']$. By reflexivity, we have $T_2[\rho] \prec_{v_2} T_2[\rho]$. We can
use those two inequalities to invoke our simultaneous decomposability hypothesis (1)
with $T_1[\rho'']$ for $\sigma'_1$ and $T_2[\rho]$ for $\sigma'_2$ to get a $\rho'$ such that both $T_1[\rho'] = T_1[\rho'']$ and
$T_2[\rho'] = T_2[\rho]$ hold. By Lemma 9 applied with (11), this implies both $\rho' = \rho''$ and
$\rho' = \rho$, and therefore (14).

Therefore, the only remaining case is when $w_1$ and $w_2$ are both $(=)$. Then $\Delta_1 \curlyvee \Delta_2$
is $(=\beta)$, which is the highest single-variable context. So our goal (2)' trivially holds.

Of these several cases, one, $(\bowtie, \bowtie)$, has $\Gamma_1 \curlyvee \Gamma_2 = \Gamma$ directly, and in the others
$\Gamma_1$ and $\Gamma_2$ were defined as the most general contexts such that $\Gamma_1 \vdash T_1 : v_1$ and
$\Gamma_2 \vdash T_2 : v_2$. If we add the further hypothesis that for each $i$, $\Gamma \vdash T_i : v_i$ holds, then
by principality of the $\Gamma_i$, we have that $\Gamma_i \leq \Gamma$ for each $i$. This implies that we have
$(\curlyvee_{i \in I} \Gamma_i) \leq \Gamma$ (when it is defined, $\curlyvee$ coincides with the lowest upper bound $\vee$). By
combination with (2), we get $(\curlyvee_{i \in I} \Gamma_i) = \Gamma$. ■

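Lemma 12 (Completeness of syntactic decomposability)

If $\llbracket \Gamma \vdash \tau : v \Rightarrow v' \rrbracket$ holds for $v' \in \{=, \bowtie\}$, then $\Gamma \vdash \tau : v \Rightarrow v'$ is provable.

Proof: Assume $\llbracket \Gamma \vdash \tau : v \Rightarrow v' \rrbracket$ holds for $v' \in \{=, \bowtie\}$, i.e. $\Gamma \vdash \tau : v$ (1) and
$\Gamma \Vdash \tau : v \leadsto v'$ (2), which expands to

$$\forall (\rho : \Gamma), \tau',\quad \tau[\rho] \prec_v \tau' \implies \exists (\rho' : \Gamma),\ \rho \prec_\Gamma \rho' \wedge \tau[\rho'] \prec_{v'} \tau' \qquad (3)$$

We show $\Gamma \vdash \tau : v \Rightarrow v'$ (4) by structural induction on $\tau$.

If $v \geq v'$ holds, then (4) directly follows from Rule sc-Triv. This applies in particular
when $v' = \bowtie$. Hence, we only need to consider the remaining cases where $v'$ is
$(=)$ and $v \not\geq v'$.

We now reason by cases on $\tau$.

Case $\tau$ is a variable $\alpha$. (3) becomes

$$\forall \rho, \tau',\quad \rho \prec_v \tau' \implies \exists \rho',\ \rho \prec_\Gamma \rho' \wedge \rho' = \tau'$$

This means that if $\rho \prec_v \tau'$ holds then $\rho \prec_\Gamma \tau'$ also holds: the variance $w\alpha \in \Gamma$
satisfies $v \geq w$. Since the hypothesis (1) implies $v \leq w$, we have $v = w$. Therefore,
(4) follows by Rule sc-Var.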

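Case $\tau$ is of the form $\bar\sigma\ t$. By inversion, the derivation of (1) must end with rule
vc-Constr, hence we have $\Gamma \vdash \sigma_i : v.w_i$ (5) for each $i \in I$ with $\Gamma \vdash \mathsf{type}\ \bar{w}\bar\alpha\ t$ (6).

Let us show that $\Gamma \Vdash (\sigma_i : v.w_i \leadsto (=).w_i)_{i \in I}$ (7), i.e.

$$\forall \rho, \bar\sigma',\quad (\forall i \in I,\ \sigma_i[\rho] \prec_{v.w_i} \sigma'_i) \implies \exists \rho',\ \rho \prec_\Gamma \rho' \wedge (\forall i \in I,\ \sigma_i[\rho'] \prec_{(=).w_i} \sigma'_i)$$

Let $\rho$ and $\bar\sigma'$ be such that $\sigma_i[\rho] \prec_{v.w_i} \sigma'_i$ holds for all $i$ in $I$. From this and (6), we
have $(\bar\sigma\ t)[\rho] \prec_v \bar\sigma'\ t$. By application of (3), there exists $\rho'$ such that $\rho \prec_\Gamma \rho'$ and
$(\bar\sigma\ t)[\rho'] = \bar\sigma'\ t$. By inversion of subtyping, this implies $\sigma_i[\rho'] \prec_{(=).w_i} \sigma'_i$ for all $i$
in $I$. This proves (7). We also note that the constructor $t$ is $v$-closed (8).

To prove our goal (4), we construct a family $(\Gamma_i)_{i \in I}$ of contexts that satisfies
$\curlyvee_{i \in I} \Gamma_i = \Gamma$ (9) and subderivations $\Gamma_i \vdash \sigma_i : v.w_i \Rightarrow (=).w_i$ (10), since then the
conclusion (4) follows by an application of rule sc-Constr with (8), (9), and (10).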

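We will handle separately the arguments $\sigma_j$ that are irrelevant, i.e. when $w_j$ is $\bowtie$,
from the rest. Let $I_{\bowtie}$ be the set of indices with irrelevant variances and $I_{\not\bowtie}$ the
others.

For any $i \in I_{\bowtie}$, $v.w_i$ and $(=).w_i$ are both $\bowtie$ so the condition (10), which becomes
$\Gamma_i \vdash \sigma_i : \bowtie \Rightarrow \bowtie$, is void of content ($\sigma_i[\rho] \prec_{\bowtie} \sigma'_i \implies \exists \rho',\ \sigma_i[\rho'] \prec_{\bowtie} \sigma'_i$ is
always true). More precisely, let $\Gamma_i$ be the irrelevant context having only irrelevant
variances. Then (10) follows by Rule sc-Triv.

Since the decomposability constraints for $i \in I$ such that $w_i = \bowtie$ are trivial, (7)
is equivalent to $\Gamma \Vdash (\sigma_i : v.w_i \leadsto (=).w_i)_{i \in I_{\not\bowtie}}$ (11).

For each $i \in I_{\not\bowtie}$, $(=).w_i$ equals $(=)$, so (11) becomes $\Gamma \Vdash (\sigma_i : v.w_i \leadsto (=))_{i \in I_{\not\bowtie}}$. By
zip completeness (Theorem 2), there is a family $(\Gamma_i)_{i \in I_{\not\bowtie}}$ such that $\Gamma \leq \curlyvee_{i \in I_{\not\bowtie}} \Gamma_i$
and both $\Gamma_i \vdash \sigma_i : v.w_i$ and $\Gamma_i \Vdash \sigma_i : v.w_i \leadsto (=)$, i.e. $\llbracket \Gamma_i \vdash \sigma_i : v.w_i \Rightarrow (=) \rrbracket$ (12),
hold for any $i \in I_{\not\bowtie}$. Furthermore, since we also have (5), we can strengthen our
result into $\curlyvee_{i \in I_{\not\bowtie}} \Gamma_i = \Gamma$. By induction hypothesis applied to (12), we have (10) for
each $i \in I_{\not\bowtie}$.

We have two families of contexts over domains $I_{\not\bowtie}$ and $I_{\bowtie}$ that partition $I$; we can
union them in a family $(\Gamma_i)_{i \in I}$ that has subderivations $\forall i \in I,\ \Gamma_i \vdash \sigma_i : v.w_i \Rightarrow v'.w_i$.
As the contexts in $I_{\bowtie}$ are all irrelevant, they are neutral for the zipping operation:
$\curlyvee_{i \in I} \Gamma_i$ is equal to $\curlyvee_{i \in I_{\not\bowtie}} \Gamma_i$, that is $\Gamma$. This proves (9), while (10) has already been
proved separately for $i \in I_{\not\bowtie}$ and $i \in I_{\bowtie}$. ■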

Remark 1 (Note (8)) The head constructor $t$ is $v$-closed in our system with atomic
subtyping, but the situation is in fact a bit stronger than that: the statement of
$v$-closure of $\bar\alpha\ t$ can be formulated in terms of decomposability $\Gamma \Vdash \bar\alpha\ t : v \leadsto (=)$.
It is very close to our decomposability hypothesis $\Gamma \Vdash \bar\sigma\ t : v \leadsto (=)$, but uses
variables $\bar\alpha$ instead of full type expressions $\bar\sigma$. We conjecture that the decomposability
hypothesis (with $=$ on the right) implies $v$-closure in a much larger set of subtyping
systems than just atomic subtyping: it suffices that the subtyping relation is defined
only in terms of head constructors.



3.6 Back to the correctness criterion 

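Remember the correctness criterion (req-SP) of Simonet and Pottier:

$$\forall \bar\alpha, \bar\alpha', \rho,\quad \big(\bar\alpha\ t \leq \bar\alpha'\ t \wedge D[\bar\alpha, \rho] \implies \exists \rho',\ D[\bar\alpha', \rho'] \wedge \tau[\rho] \leq \tau[\rho']\big) \qquad (1)$$

We now show how the closure judgment $\Gamma \vdash \tau : v \Rightarrow v'$ can be used to verify that
this criterion holds: we will express this criterion in an equivalent form that uses
the interpretation of our judgments.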

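The first step is to rewrite the property $\bar\alpha\ t \leq \bar\alpha'\ t$ using the variance annotation
$\bar{v}\bar\alpha$ of $t$. Again, we are taking the variance annotation for the datatype $t$ for granted
(this is why we can use it in this reasoning step), and checking that the definitions
of the constructors of $t$ are sound with respect to this annotation.

$$\forall \bar\alpha, \bar\alpha', \rho,\quad \big((\forall i,\ \alpha_i \prec_{v_i} \alpha'_i) \wedge D[\bar\alpha, \rho] \implies \exists \rho',\ D[\bar\alpha', \rho'] \wedge \tau[\rho] \leq \tau[\rho']\big) \qquad (2)$$

Since the constraint $D[\bar\alpha, \bar\beta]$ is a set of equalities of the form $\alpha_i = T_i[\bar\beta]$ (where $T_i$
is a type), (2) is actually:

$$\forall \bar\alpha, \bar\alpha', \rho,\quad (\forall i,\ \alpha_i \prec_{v_i} \alpha'_i) \wedge (\forall i,\ \alpha_i = T_i[\rho]) \implies \exists \rho',\ (\forall i,\ T_i[\rho'] = \alpha'_i) \wedge \tau[\rho] \leq \tau[\rho']$$

Substituting the equalities and, in particular, removing the quantification on the $\alpha_i$,
which are fully determined by the equality constraints $\alpha_i = T_i[\rho]$, we get:

$$\forall \bar\alpha', \rho,\quad (\forall i,\ T_i[\rho] \prec_{v_i} \alpha'_i) \implies \exists \rho',\ (\forall i,\ T_i[\rho'] = \alpha'_i) \wedge \tau[\rho] \leq \tau[\rho'] \qquad (3)$$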

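By inversion (Theorem 1), we may replace the goal $\tau[\rho] \leq \tau[\rho']$ by the formula
$\exists \Gamma,\ (\Gamma \vdash \tau : +) \wedge (\rho \prec_\Gamma \rho')$. Moreover, since $\Gamma \vdash \tau : +$ always holds for some $\Gamma$, we
may move this quantification in front. Hence, (3) is equivalent to:

$$\exists \Gamma,\quad \Gamma \vdash \tau : (+) \ \wedge\ \Big(\forall \bar\alpha', \rho,\ (\forall i,\ T_i[\rho] \prec_{v_i} \alpha'_i) \implies \exists \rho',\ (\forall i,\ T_i[\rho'] = \alpha'_i) \wedge \rho \prec_\Gamma \rho'\Big) \qquad (4)$$

We may recognize in the second clause the simultaneous decomposability judgment
$\Gamma \Vdash (T_i : v_i \leadsto (=))_{i \in I}$ defined earlier. Hence, (4) is in fact:

$$\exists \Gamma,\quad \Gamma \vdash \tau : (+) \ \wedge\ \Gamma \Vdash (T_i : v_i \leadsto (=))_{i \in I} \qquad (5)$$

Then comes the delicate step of this series of equivalent rewritings:

$$\exists \Gamma, (\Gamma_i)_{i \in I},\quad \Gamma \vdash \tau : (+) \ \wedge\ \Gamma = \curlyvee_{i \in I} \Gamma_i \ \wedge\ \forall i \in I,\ \big(\Gamma_i \vdash T_i : v_i \ \wedge\ \Gamma_i \Vdash T_i : v_i \leadsto (=)\big) \qquad (6)$$

The reverse implication, from (6) to (5), is the zip soundness (Lemma 5).

The direct implication, from (5) to (6), is more involved: let $\Gamma_0$ be such that
$\Gamma_0 \vdash \tau : +$. By zip completeness (Theorem 2), with the hypotheses of (5), there
exists a family $(\Gamma_i)_{i \in I}$ satisfying the typing, zipping and decomposability conjuncts
of (6) with $\Gamma_0 \leq \curlyvee_{i \in I} \Gamma_i$. We take $\curlyvee_{i \in I} \Gamma_i$ for $\Gamma$. Then, from $\Gamma_0 \leq \Gamma$ we get
$\Gamma \vdash \tau : +$ by monotonicity (Lemma 2).

As a last step, the last conjuncts of (6) are equivalent to $\forall i,\ \Gamma_i \vdash T_i : v_i \Rightarrow (=)$
by interpretation of syntactic decomposability and by soundness and
completeness (Lemmas 6 and 12). Therefore, (6) is equivalent to:

$$\exists \Gamma, (\Gamma_i)_{i \in I},\quad \Gamma \vdash \tau : (+) \ \wedge\ \Gamma = \curlyvee_{i \in I} \Gamma_i \ \wedge\ \forall i \in I,\ \Gamma_i \vdash T_i : v_i \Rightarrow (=) \qquad (7)$$

which is our final criterion.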
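
As an illustration, here is a small worked instance of criterion (7). It is a sketch written for this exposition, not part of the original development. It considers the Prod constructor of the expr example, whose constraint is $\alpha = \beta * \gamma$ and whose argument type is $\tau = \beta\ \mathtt{expr} * \gamma\ \mathtt{expr}$. It assumes that expr is declared covariant, that the product constructor is covariant in both parameters and $v$-closed (as in atomic subtyping), and that the rules sc-Var and sc-Constr read as they are used in the proof of Lemma 12.

```latex
\begin{align*}
\text{Goal:}\quad & \Gamma' \vdash \beta * \gamma : (+) \Rightarrow (=)
  && \text{(single constraint, } v = +\text{)} \\
\text{sc-Constr:}\quad & \Gamma_1 \vdash \beta : (+).(+) \Rightarrow (=).(+)
  \ \text{ i.e. }\ \Gamma_1 \vdash \beta : (+) \Rightarrow (=) \\
 & \Gamma_2 \vdash \gamma : (+).(+) \Rightarrow (=).(+)
  \ \text{ i.e. }\ \Gamma_2 \vdash \gamma : (+) \Rightarrow (=) \\
\text{sc-Var:}\quad & \Gamma_1 = (+\beta,\ \bowtie\gamma), \qquad
  \Gamma_2 = (\bowtie\beta,\ +\gamma) \\
\text{zip:}\quad & \Gamma' = \Gamma_1 \curlyvee \Gamma_2 = (+\beta,\ +\gamma)
  && \text{(}\bowtie\text{ is neutral for } \curlyvee\text{)} \\
\text{variance:}\quad & (+\beta,\ +\gamma) \vdash \beta\ \mathtt{expr} * \gamma\ \mathtt{expr} : (+)
\end{align*}
```

Under these assumptions, taking $\Gamma = \Gamma'$ satisfies all three conjuncts of (7) for this constructor. The step that would fail in a richer subtyping relation is the $v$-closure premise of sc-Constr for the product type.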



Pragmatic evaluation of this criterion. This presentation of the correctness
criterion only relies on syntactic judgments. It is pragmatic in the sense that it
suggests a simple and direct implementation, as a generalization of the check
currently implemented in type system engines, which are only concerned with the
$\Gamma \vdash \tau : +$ part.

To compute the contexts $\Gamma$ and $(\Gamma_i)_{i \in I}$ existentially quantified in this formula,
one can use a variant of our syntactic judgments where the environment $\Gamma$ is not an
input, but an output of the judgment; in fact, one should return for each variable
$\alpha$ the set of possible variances for this judgment to hold. For example, the query
$(? \vdash \alpha * \beta\ \mathtt{ref} : +)$ should return $(\alpha \mapsto \{+, =\};\ \beta \mapsto \{=\})$. Defining those
algorithmic variants of the judgments is routine, and we have not done it here. The sets of
variances corresponding to the decomposability of the $(T_i)_{i \in I}$, returned by $(? \vdash T_i : v_i \Rightarrow (=))$,
should be zipped together and intersected with the possible variances for $\tau$,
returned by $(? \vdash \tau : +)$. The algorithmic criterion is satisfied if and only if the
intersection is not empty; this can be decided in a simple and efficient way.
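
The variance-checking half of this query can be sketched in a few lines of OCaml. This is only an illustration under stated assumptions, not the algorithmic judgment itself: the mini type language `ty`, the names `leq`, `compose`, `occurrences` and `admissible` are hypothetical, and the sketch covers the query $(? \vdash \tau : v)$ only, leaving out decomposability and zipping.

```ocaml
(* Variances: Irr is the irrelevant variance, Inv is (=). *)
type variance = Irr | Co | Contra | Inv

(* Order on variances: Irr is lowest, Inv is highest,
   Co and Contra are incomparable. *)
let leq v w = v = w || v = Irr || w = Inv

(* Composition v.w: irrelevance absorbs, then invariance, then the sign rule. *)
let compose v w =
  match v, w with
  | Irr, _ | _, Irr -> Irr
  | Inv, _ | _, Inv -> Inv
  | Co, x -> x
  | Contra, Co -> Contra
  | Contra, Contra -> Co

(* Hypothetical mini type language: a variable, or a constructor application
   where each argument is paired with the declared variance of its parameter. *)
type ty = Var of string | App of (variance * ty) list

(* Every occurrence of a variable, with the variance it is checked at. *)
let rec occurrences v = function
  | Var a -> [ (a, v) ]
  | App args ->
    List.concat_map (fun (w, t) -> occurrences (compose v w) t) args

let all_variances = [ Irr; Co; Contra; Inv ]

(* Answer to the query (? |- ty : v): for each variable occurring in ty,
   the variances that are above every variance required at its occurrences. *)
let admissible ty v =
  let occs = occurrences v ty in
  let vars = List.sort_uniq compare (List.map fst occs) in
  List.map
    (fun a ->
      let needs =
        List.filter_map (fun (b, n) -> if a = b then Some n else None) occs
      in
      (a, List.filter (fun w -> List.for_all (fun n -> leq n w) needs)
            all_variances))
    vars

(* The example from the text: alpha * beta ref, checked at (+). *)
let example = App [ (Co, Var "alpha"); (Co, App [ (Inv, Var "beta") ]) ]
let result = admissible example Co
```

On the example from the text, `result` associates `alpha` with `[Co; Inv]` and `beta` with `[Inv]`, matching the expected answer. A full implementation would compute similar sets for the decomposability queries, zip them, and test the intersection for emptiness.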



4 Closed-world vs. open-world subtyping

4.1 Upward and downward closure in an ML type system

In the type system we have used so far, all types are both upward and downward- 
closed. Indeed, thanks to the simplicity of our subtyping relation, we have a very 
strong inversion principle: two ground types in a subtyping relation necessarily 
have exactly the same structure. We have therefore completely determined a sound 
variance check for a simple type system with GADT. 

This simple resolution, however, does not hold in general: richer subtyping
relations will have weaker invertibility properties. As soon as a bottom type $\bot$ is
introduced, for example, such that for all types $\sigma$ we have $\bot \leq \sigma$,
downward-closure fails for most types. For example, products are no longer downward-closed:
$\Gamma \vdash \sigma * \tau \geq \bot$ does not imply that $\bot$ is equal to some $\sigma' * \tau'$. Conversely, if one adds
a top type $\top$, bigger than all other types, then most types are not upward-closed
anymore.






In OCaml, there is no $\bot$ or $\top$ type. However, object types and polymorphic
variants have subtyping, so they are, in general, neither upward nor downward-closed.
Finally, subtyping is also used in private type definitions, which were demonstrated
in the example.
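
As a reminder of how private types introduce subtyping, here is a minimal OCaml sketch. The module `Nat` and the functions `of_int` and `to_int` are hypothetical names chosen for illustration. The private abbreviation makes `Nat.t` a subtype of `int` through an explicit, one-directional coercion, which is exactly what prevents `int` from being downward-closed.

```ocaml
(* A private abbreviation: values of Nat.t can be read as ints,
   but can only be built through of_int. *)
module Nat : sig
  type t = private int
  val of_int : int -> t option
end = struct
  type t = int
  let of_int n = if n >= 0 then Some n else None
end

(* The coercion Nat.t :> int is explicit and goes in one direction only. *)
let to_int (n : Nat.t) : int = (n :> int)

let three =
  match Nat.of_int 3 with
  | Some n -> to_int n
  | None -> -1
```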

Our closure-checking relation therefore degenerates into the following, quite un- 
satisfying, picture: 

• no type is downward-closed because of the existence of private types; 

• no object type but the empty object type is upward-closed; 

• no arrow type is upward-closed because its left-hand-side would need to be 
downward-closed; 

• datatypes are upward-closed if their component types are.

From a pragmatic point of view, the situation is not so bad; as our main prac- 
tical motivation for finer variance checks is the relaxed value restriction, we care 
about upward-closure (covariance) more than downward-closure (contravariance). 
This criterion tells us that covariant parameters can be instantiated with covariant 
datatypes defined from sum and product types (but no arrow) , which would satisfy 
a reasonably large set of use cases. 
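
The four rules above can be read as a recursive check. The following OCaml sketch is only an illustration: the type `ty` and the functions `downward_closed` and `upward_closed` are hypothetical, and base types such as `int` are modelled, by assumption, as datatypes without components.

```ocaml
(* Hypothetical representation of the types discussed in the list above. *)
type ty =
  | Object of string list          (* object type, given by its method names *)
  | Arrow of ty * ty               (* function type *)
  | Data of string * ty list       (* datatype applied to its component types *)

(* No type is downward-closed, because of private types. *)
let downward_closed (_ : ty) = false

let rec upward_closed = function
  (* Only the empty object type is upward-closed. *)
  | Object methods -> methods = []
  (* An arrow would need a downward-closed left-hand side, so this is false. *)
  | Arrow (domain, codomain) -> downward_closed domain && upward_closed codomain
  (* Datatypes are upward-closed if their component types are. *)
  | Data (_, components) -> List.for_all upward_closed components

let int_ty = Data ("int", [])
let list_of_pairs = Data ("list", [ Data ("pair", [ int_ty; int_ty ]) ])
let option_of_fun = Data ("option", [ Arrow (int_ty, int_ty) ])
```

With these definitions, `list_of_pairs` is accepted as upward-closed while `option_of_fun` is not, which matches the observation that covariant parameters can be instantiated with datatypes built from sums and products but not arrows.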

4.2 A better control on upward and downward-closure 

As explained in the introduction, the problem with the upward and downward 
closure properties is that they are not monotonic: enriching the subtyping lattice of 
our type system does not preserve them. While the core language has a nice variance 
check for GADT, adding private types in particular destroys the downward-closure 
property of the whole type system. 

Our proposed solution to this tension is to give the user the choice to locally
strengthen negative knowledge about the subtyping relation by abandoning some
flexibility. Just as object-oriented languages have a concept of final classes that
cannot be extended, we would like to allow the definition of downward-closed datatypes,
whose private counterparts cannot be declared, and upward-closed datatypes that
cannot be made invisible: defining type t = private $\tau$ would be rejected by the
type-checker if $\tau$ was itself declared downward-closed.

Such "closure specifications" are part of the semantic properties of a type and
would, as such, sometimes need to be exposed through module boundaries. It is
important that the specification language for abstract types allows one to say that a type
is upward-closed (respectively downward-closed). These new ways to classify types
raise some software engineering questions. When is it desirable to define types as
upward-closed? The user must balance their ability to define semi-abstract versions
of the type against its use in a GADT, and potentially in other type-system features
that would make use of negative reasoning on the subtyping relation. We do not
yet know how to answer this question and believe that more practice is necessary
to get a clearer picture of the trade-off involved.

4.3 Subtyping constraints and variance assignment 

We will now revisit our previous example, using the guarded existential notation:

    type α expr =
      | Val of ∃β[α = β]. β
      | Int of [α = int]. int
      | Thunk of ∃βγ[α = γ]. β expr * (β → γ)
      | Prod of ∃βγ[α = β * γ]. β expr * γ expr

(Footnote to Section 4.1, on the absence of $\bot$ and $\top$ in OCaml: a bottom type would be
admissible, but a top type would be unsound in OCaml, as different
types may have different runtime representations. Existential types, that may mix values of
different types, are constructed explicitly through a boxing step.)

A simple way to get such a type to be covariant would be, instead of proving delicate,
non-monotonic upward-closure properties on the tuple type involved in the equation
$\alpha = \beta * \gamma$, to change this definition so that the resulting type is obviously covariant:

    type +α expr =
      | Val of ∃β[α ≥ β]. β
      | Int of [α ≥ int]. int
      | Thunk of ∃βγ[α ≥ γ]. β expr * (β → γ)
      | Prod of ∃βγ[α ≥ β * γ]. β expr * γ expr

We have turned each equality constraint $\alpha = T[\bar\beta]$ into a subtyping constraint
$\alpha \geq T[\bar\beta]$. For a type $\alpha'$ such that $\alpha \leq \alpha'$, we get by transitivity that $\alpha' \geq T[\bar\beta]$.
This means that $\alpha$ expr trivially satisfies the correctness criterion from Simonet
and Pottier. Formally, instead of checking $\Gamma \vdash T_i : v_i \Rightarrow (=)$, we are now checking
$\Gamma \vdash T_i : v_i \Rightarrow (+)$, which is significantly easier to satisfy: when $v_i$ is itself $(+)$ we
can directly apply the sc-Triv rule.

While we now have a different datatype, which gives us a weaker subtyping
assumption when pattern-matching, we are still able to write the classic function
eval : α expr → α, because the constraints $\alpha \geq \tau$ are in the right direction to get
an $\alpha$ as a result.

    let rec eval : α expr → α = function
      | Val β (v : β) -> (v :> α)
      | Int (n : int) -> (n :> α)
      | Thunk β γ ((v : β expr), (f : β → γ)) ->
        (f (eval v) :> α)
      | Prod β γ ((b : β expr), (c : γ expr)) ->
        ((eval b, eval c) :> α)

We conjecture that moving from an equality constraint on the GADT type pa- 
rameters to a subtyping constraint (bigger than, or smaller than, according to the 
desired variance for the parameter) is often unproblematic in practice. In the exam- 
ples we have studied, such a change did not stop functions from type-checking — we 
only needed to add some explicit coercions. 
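
For comparison, the equality-constraint version of this datatype can be written in today's OCaml GADT syntax. The sketch below is an illustration using standard OCaml only; the subtyping-constraint variant above uses a notation that OCaml does not support, so it is not reproduced here. With equality constraints, no coercion is needed in `eval`, since the equations are used implicitly in each branch.

```ocaml
(* The expr datatype with equality constraints, in OCaml GADT syntax. *)
type _ expr =
  | Val : 'a -> 'a expr
  | Int : int -> int expr
  | Thunk : 'b expr * ('b -> 'c) -> 'c expr
  | Prod : 'b expr * 'c expr -> ('b * 'c) expr

(* Each branch refines the locally abstract type a through the
   equation carried by the constructor. *)
let rec eval : type a. a expr -> a = function
  | Val v -> v
  | Int n -> n
  | Thunk (e, f) -> f (eval e)
  | Prod (b, c) -> (eval b, eval c)

let sample = eval (Prod (Int 1, Thunk (Int 2, fun n -> n + 1)))
```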

However, allowing subtyping constraints in GADT has some disadvantages. If
the language requires subtyping casts to be explicit, this would make pattern matching
of GADT syntactically heavier than with current GADT, where equality
constraints are used implicitly. This is related to practical implementation questions,
as languages based on inference by unification tend to favor equality over subtyping,
and bidirectional coercions over unidirectional ones. Subtyping constraints also need to be
explicit in the type declaration, forcing the user out of the convenient "generalized
codomain type" syntax.

From a theoretical standpoint, we think there is value in exploring both directions:
experimenting with GADT using subtyping constraints, and with fine-grained
closure properties for equality constraints. Both designs allow reasoning in an
open-world setting, by being resilient to extensions of the subtyping relation. Whether
it is possible to expose those features to the expert language user (e.g. library
designers) without forcing all users to pay the complexity burden remains to be
seen.

(Footnote to the check $\Gamma \vdash T_i : v_i \Rightarrow (+)$ above: note that the formal proofs of the preceding
section were, in some cases, specialized to the
equality constraint. More precisely, our decomposability criterion is still sound when extended to
arbitrary subtyping constraints, but its completeness is unknown and left to future work.)

5 Future Work 

Extension of the formal exposition to non-atomic subtyping. As remarked
in §2.1 during the definition of our formal subtyping relation, the soundness proof
of Simonet and Pottier is restricted to atomic subtyping. We conjecture that their
work can be extended to non-atomic subtyping, and furthermore that our results
would extend seamlessly to this setting, thanks to our explicit use of the $v$-closure
hypothesis.

On the relaxed value restriction. Regarding the relaxed value restriction,
which is our initial practical motivation to investigate variance in the presence of GADT,
there is also future work to be done to verify that it is indeed compatible with this
refined notion of variance. While the syntactic proof of soundness of the relaxation
doesn't involve subtyping directly, the "informal justification" for value restriction
uses the admissibility of a global bottom type $\bot$ to generalize a covariant unification
variable; in the presence of downward-closed types, there is no such general $\bot$ type (only
one for non-downward-closed types). We conjecture that the relaxed value restriction
is still sound in this case, because the covariance criterion is really used to rule
out mutable state rather than to subtype from a $\bot$ type; but it will be necessary to
study the justification of the relaxation in more detail to formally establish this result.

Experiments with $v$-closure of type constructors as a new semantic property.
In a language with non-atomic subtyping such as OCaml, we need to distinguish
$v$-closed and non-$v$-closed type constructors. This is a new semantic property
that, in particular, must be reflected through abstraction boundaries: we should be
able to say about an abstract type that it is $v$-closed, or not say anything.

How inconvenient in practice is the need to expose those properties to have
good variance for GADT? Will the users be able to determine whether they want
to enforce $v$-closure for a particular type they are defining?

Experiments with subtyping constraints in GADT. In §4.3 we have presented
a different way to define GADT with weaker constraints (simple subtyping
instead of equality) and stronger variance properties. It is interesting to note that,
for the few GADT examples we have considered, using subtyping constraints rather
than equality constraints was sufficient for the desired applications of the GADT.

However, there are cases where the strong equality relying on fine-grained closure
properties is required. We need to consider more examples of both cases to evaluate
the expressiveness trade-off in, for example, deciding to add only one of these
solutions to an existing type system.

On the implementation side, we suspect that adding subtyping constraints to a
type system that already supports GADT and private types should not require large
engineering efforts (in particular, it does not imply supporting the most general
forms of bounded polymorphism). Matching on a GADT $\bar\alpha\ t$ already introduces
local type equalities of the form $\alpha = T[\bar\beta]$ in pattern matching clauses. Jacques
Garrigue suggested that adding an equality of the form $\alpha = \mathsf{private}\ T[\bar\beta]$ should
correspond to GADT equations of the form $\alpha \leq T[\bar\beta]$, and lower bounds could be
represented using the dual notion of invisible types. Regardless of implementation
difficulties, in a system with only explicit subtyping coercion, such subtyping
constraints would still require more user annotations.

Mathematical structures for variance studies. There has been work on a more
structured presentation of GADT as part of a categorical framework ([GJ08] and
[HF11]). This is orthogonal to the question of variance and subtyping, but it may
be interesting to re-frame the current result in this framework.

Parametrized types with variance can also be seen as a sub-field of order theory 
with very partial orders and functions with strong monotonicity properties. Finally, 
we have been surprised to find that geometric intuitions were often useful to direct 
our formal developments. It is possible that existing work in these fields would 
allow us to streamline the proofs, which currently are rather low-level and tedious. 

Completeness of variance annotations with domain information. For simple
algebraic datatypes, variance annotations are "enough" to say anything we want
to say about the variance of datatypes. Essentially, all admissible variance relations 
between datatypes can be described by considering the pairwise variance of their 
parameters, separately. 

This does not work anymore with GADT. For example, the equality type $(\alpha, \beta)$ eq
cannot be accurately described by considering variation of each of its parameters
independently. We would like to say that $(\alpha, \beta)$ eq $\leq$ $(\alpha', \beta')$ eq holds as soon as
$\alpha = \beta$ and $\alpha' = \beta'$. With the simple notion of variance we currently have, all we can
soundly say about eq is that it must be invariant in both its parameters, which is
considerably weaker. In particular, the well-known trick of "factoring out" GADT
by using the eq type in place of equality constraints does not preserve variances:
equality constraints allow fine-grained variance considerations based on upward or
downward-closure, while the equality type instantly makes its parameters
invariant.

We think it would be possible to regain some "completeness", and in particular
re-enable factoring by eq, by considering domain information, that is information
on constraints that must hold for the type to be inhabited. If we restricted the
subtyping rule with conclusion $\bar\alpha\ t \leq \bar\alpha'\ t$ to only cases where $\bar\alpha\ t$ and $\bar\alpha'\ t$ are
inhabited (with a separate rule to conclude subtyping in the non-inhabited case),
we could have a finer variance check, as we would only need to show that the criterion
of Simonet and Pottier holds between two instances of the inhabited domain, and
not any instance. If we stated that the domain of the type $(\alpha, \beta)$ eq is restricted by
the constraint $\alpha = \beta$, we could soundly declare the variance $(\bowtie\alpha, \bowtie\beta)$ eq on this
domain, which no longer prevents factoring out GADT by equality types.
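
For readers unfamiliar with the factoring trick, the following OCaml sketch shows the equality type and a datatype factored through it. It is an illustration only: the names `cast`, `fexpr`, `FVal`, `FInt` and `feval` are hypothetical, and only two constructors of the expr example are reproduced. The factored type is an ordinary datatype whose parameter occurs under `eq`, so its variance is bounded by that of `eq`.

```ocaml
(* The equality type: inhabited only when both parameters coincide. *)
type (_, _) eq = Refl : ('a, 'a) eq

(* Matching on Refl reveals the equation a = b. *)
let cast : type a b. (a, b) eq -> a -> b = fun Refl x -> x

(* A fragment of expr factored through eq: an ordinary datatype
   whose constructors carry an explicit equality witness. *)
type 'a fexpr =
  | FVal of 'a
  | FInt of ('a, int) eq * int

let feval : type a. a fexpr -> a = function
  | FVal v -> v
  | FInt (Refl, n) -> n

let seven = feval (FInt (Refl, 7))
```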

Conclusion 

Checking the variance of GADT is surprisingly more difficult (and interesting) than
we initially thought. We have studied a novel criterion of upward and downward
closure of type expressions and proposed a corresponding syntactic judgment that
is easily implementable. We presented a core formal framework to prove both
its correctness and its completeness with respect to the more general criterion of
Simonet and Pottier.

This closure criterion exposes important tensions in the design of a subtyping
relation, for which we previously knew of no convincing example in the context
of ML-derived programming languages. We have suggested new language features
to help alleviate these tensions, whose convenience and practicality are yet to be
assessed by real-world usage.

Considering extension of GADT in a rich type system is useful in practice; it is 
also an interesting and demanding test of one's type system design.